I. INTRODUCTION


In practice, the age of a device is often measured in more than one time scale. For 
example, automobiles age in the “parallel” scales of calendar time since purchase and 
number of miles driven. As such, routine engine maintenance depends on both of these 
scales: an oil change is recommended every three months or 3,000 miles, whichever 
comes first. For some devices, the scale most relevant for maintenance is clear. For 
example, Kordonsky and Gertsbakh (1993) note that for a jet engine turbine, the duration 
of the warmup period is the most relevant (of several possible scales) but for the 
undercarriage of an aircraft, the number of landings is most relevant. For other devices, 
however, the most relevant scale for maintenance is difficult to determine. For example, 
the joint between an aircraft wing and the fuselage is subjected simultaneously to 
corrosion (thus the scale “calendar time” is relevant), landing stresses (thus number of 
landings is relevant), and level flight stresses (thus total flight time is relevant), as noted 
by Kordonsky and Gertsbakh (1993). In any case, a maintenance policy should take into 
account the parallel scales in which an item operates. In a military setting, attempts are 
made to model the effect of chronological or operational time on the failure 
characteristics of a military device during the developmental testing phase. During this 
phase, however, it may be difficult or impossible to accurately model the effect of usage 
on the device resulting from military missions. Thus, classical failure models are used to 
develop single-scale maintenance policies, even though it is well known that the device 
will operate in the parallel scales of chronological (or operational) time and number of
missions. Lifetime data including the total number of missions (e.g., landings) accrued at
the time of device failure may become available later in the acquisition cycle, such as 
during operational testing or upon initial fielding. Military maintenance costs should be 
reduced by using policies that directly account for aging in multiple scales. In this 
dissertation we focus on developing, optimizing, and estimating maintenance policies, in 
particular age replacement policies, based on multiple time scales. 

A. SINGLE-SCALE AGE REPLACEMENT POLICIES 

The vast majority of methods for developing maintenance policies are based on a 
single time scale; see McCall (1965), Pierskalla and Voelker (1976), and Valdez-Flores 
and Feldman (1989) for comprehensive reviews. Among the most useful and most 
studied are age replacement policies, under which a device is replaced (or overhauled) at 
failure or at a predetermined age $\tau > 0$, whichever occurs first. Let X be a positive random
variable (r.v.) representing the lifetime of a device, i.e., the time when the device fails.
Let X have distribution function F; following Bather (1977) it will be convenient to
define $F(x) = P(X < x)$ and the survivor function as $S(x) = P(X \ge x)$. Thus, under an age
replacement policy, a device is replaced with a new one at time $\min\{X, \tau\}$. Let the cost
for replacement be K > 0 if the device is replaced due to age (i.e., preventively, since
$X \ge \tau$) and K + C if it is replaced due to failure (i.e., $X < \tau$), where the additional cost of
replacement at failure is C > 0. If devices have independent lifetimes, then replacement
times occur according to a renewal process. From the Renewal Reward Theorem (e.g., 
Ross, 1997), the long-run average cost per unit of time that the device is in use is

$$C(\tau) = \frac{K + C\,F(\tau)}{\int_0^{\tau} S(u)\,du}, \qquad \tau > 0. \tag{1.1}$$


A complete derivation of this expression can be found in Appendix A. If F is absolutely
continuous and has an increasing failure rate (IFR), then $C(\tau)$ has at most one minimum.
In addition, if the failure rate is continuous and strictly increasing to $\infty$, there exists a
unique and finite value $\tau^*$ minimizing $C(\tau)$ (e.g., Barlow & Proschan, 1965). Bergman
(1982) shows that a unique, finite $\tau^*$ is attained under slightly less restrictive conditions.
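
As a rough illustration of how (1.1) can be minimized numerically, the following sketch evaluates the cost rate for a Weibull lifetime and searches a grid for its minimizer. The Weibull parameters, the costs K = 1 and C = 4, the grid, and all function names are assumptions chosen for illustration; they do not come from the sources cited above.

```python
import math

def weibull_survivor(x, shape, scale):
    """S(x) = exp(-(x/scale)^shape) for a Weibull lifetime (assumed model)."""
    return math.exp(-((x / scale) ** shape))

def cost_rate(tau, K, C, shape, scale, steps=1000):
    """Evaluate (1.1): (K + C*F(tau)) / integral_0^tau S(u) du."""
    h = tau / steps
    # Trapezoid rule for the integral of the survivor function; S(0) = 1.
    total = 0.5 * (1.0 + weibull_survivor(tau, shape, scale))
    for j in range(1, steps):
        total += weibull_survivor(j * h, shape, scale)
    integral = total * h
    failure_prob = 1.0 - weibull_survivor(tau, shape, scale)
    return (K + C * failure_prob) / integral

def grid_minimizer(K, C, shape, scale, upper, points=400):
    """Return the grid point in (0, upper] with the smallest cost rate."""
    grid = [upper * (j + 1) / points for j in range(points)]
    return min(grid, key=lambda tau: cost_rate(tau, K, C, shape, scale))

# Hypothetical inputs: shape 2 gives a strictly increasing failure rate.
tau_star = grid_minimizer(K=1.0, C=4.0, shape=2.0, scale=1.0, upper=3.0)
print(tau_star, cost_rate(tau_star, 1.0, 4.0, 2.0, 1.0))
```

A grid search is used only because it is easy to verify; any one-dimensional optimizer could be substituted when the failure rate is increasing and the minimum is therefore unique.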

When F is completely specified, $\tau^*$ can be found explicitly, but is more often
found with numerical methods. Glasser (1967) uses numerical methods to obtain charts
which can be used to find $\tau^*$ when F is truncated normal, gamma, or Weibull. When F is
unknown, there are numerous approaches available for estimating $\tau^*$ based upon lifetime
data. In most of these approaches, F in equation (1.1) is replaced with an estimator $\hat{F}$
based upon the data. This results in an estimator $\hat{C}(\tau)$ of the cost function $C(\tau)$; $\tau^*$ is
then estimated by minimizing $\hat{C}(\tau)$. For example, given a simple random sample
$X_1, \ldots, X_n$ of lifetimes from F, non-parametric estimators of $C(\tau)$ and $\tau^*$ can be found
using the empirical survivor function


$$\hat{S}(\tau) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \ge \tau), \tag{1.2}$$

where $I(X_i \ge \tau) = 1$ if $X_i \ge \tau$ and 0 otherwise. It follows that the estimator of $C(\tau)$ is


$$\hat{C}(\tau) = \frac{(K + C) - C\,\hat{S}(\tau)}{\int_0^{\tau} \hat{S}(u)\,du}, \qquad \tau > 0. \tag{1.3}$$

From the definition of $\hat{S}(\tau)$, it is seen that $\hat{C}(\tau)$ is lower semi-continuous with
denominator strictly increasing on $(0,\infty)$ and numerator a lower semi-continuous step
function constant between observations. As a result, local minima of $\hat{C}(\tau)$ are found at
the observations and we define $\hat{\tau} = \arg\min_{X_i} \hat{C}(X_i)$. Also, $\hat{\tau}$ is not necessarily unique.
Arunkumar (1972) proves that $\hat{C}(\tau)$ and $\hat{\tau}$ are strongly consistent estimators of $C(\tau)$ and
$\tau^*$, respectively. Ingram and Scheaffer (1976) address estimation using the non-parametric
maximum likelihood estimator (MLE) of F under the restriction of F having
an increasing failure rate. The optimal policy $\tau^*$ can also be estimated under other
sampling schemes; for example, Kumar and Westberg (1997) estimate $\tau^*$ under right-censoring,
and Bather (1977), Frees and Ruppert (1985), and Aras and Whitaker (1992)
address sequential estimation of $\tau^*$. Graphical approaches can also be used to minimize
(1.1) and (1.3). Bergman (1977) uses the total time on test (TTT) plotting method of
Barlow and Campo (1975) to estimate $\tau^*$. This method is insightful since one can deduce
ranges of the ratio K/C for which a particular $\tau^*$ is optimal. Two comprehensive
treatments of this approach are contained in Bergman and Klefsjö (1982) and Klefsjö
(1986).
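
The non-parametric estimator can be sketched in a few lines. The code below assumes a complete (uncensored) sample and uses the identity that the integral of the empirical survivor function over $(0,\tau)$ equals the average of $\min(X_i, \tau)$; the sample values, the costs, and the function names are hypothetical.

```python
def empirical_cost(tau, lifetimes, K, C):
    """Evaluate the empirical cost function (1.3) at tau for a complete sample."""
    n = len(lifetimes)
    # F_hat(tau) = fraction of lifetimes strictly below tau, so 1 - S_hat(tau).
    failed_fraction = sum(1 for x in lifetimes if x < tau) / n
    # integral_0^tau S_hat(u) du = mean of min(X_i, tau).
    mean_time_in_use = sum(min(x, tau) for x in lifetimes) / n
    return (K + C * failed_fraction) / mean_time_in_use

def estimate_replacement_age(lifetimes, K, C):
    """Search the observed lifetimes, where local minima of (1.3) occur."""
    candidates = sorted(set(lifetimes))
    # min() returns the smallest minimizer when several observations tie.
    return min(candidates, key=lambda tau: empirical_cost(tau, lifetimes, K, C))

# Hypothetical sample of five lifetimes with K = 1 and C = 4.
sample = [1, 8, 9, 10, 11]
tau_hat = estimate_replacement_age(sample, K=1.0, C=4.0)
print(tau_hat, empirical_cost(tau_hat, sample, 1.0, 4.0))
```

Because the minimizer need not be unique, returning the smallest one is a design choice of this sketch, not a convention taken from the literature above.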

B. FAILURE MODELING IN MULTIPLE TIME SCALES 

Extending this theory so it can be used for maintenance of a device whose age is 
measured in multiple scales requires more than generalizing a univariate lifetime X to a 
multivariate lifetime, say (X,Y). This is not always immediately apparent. Confusion
arises because data used to estimate multiple-scale policies often appear to be of the form
$(X_1,Y_1), (X_2,Y_2), \ldots, (X_n,Y_n)$. Nonetheless, the actual implementation of an age
replacement policy requires that a device be tracked continuously through time. Even in
a single scale, a policy cannot be implemented by observing the age at failure; the device
is monitored through time so that it can be replaced at failure or time $\tau$, whichever comes
first. The implementation of such a policy in more than one scale requires knowledge of
the usage path, or “history” of the device; this notion is central to the literature of
multiple time scales (e.g., Duchesne and Lawless, 2000). Let $x > 0$ denote the
chronological time since introduction of a device into service, and let $y(x)$ represent usage
accumulated by the device up to age x (e.g., the total number of miles an automobile has
been driven up to age x). The usage path of a device up to chronological time x is
defined to be $Z(x) = \{(u,y(u)): 0 \le u \le x\}$. In addition, if the random variable X represents
the chronological age of the device at failure and $Y = y(X)$, then (X,Y) represents the time
and cumulative usage at failure. In some cases a vector $\mathbf{y}(x)$ of various measures of usage
is available (e.g., $y_1(x)$ could be the number of flight hours accrued as of chronological
time x, and $y_2(x)$ could be the number of landings accrued as of chronological time x,
etc.). Then, the usage path is $Z(x) = \{(u,\mathbf{y}(u)): 0 \le u \le x\}$. In most of what follows,
however, we assume only a single measure of usage is available in order to simplify the 
presentation. Typically, a measure of usage is required to be both non-decreasing in x 
and an external covariate. The latter requirement (see Kalbfleisch and Prentice, 1980, 
Section 5.3) ensures the usage path Z is determined independently of the time to failure X.

Modeling the lifetime of a device whose failure depends upon the parallel effects
of time and usage has received a great deal of attention in the past decade. Three main 
approaches are found in the literature. The first approach is to use a conditional model. 
Lawless et al. (1995) model automobile warranty data by considering separately the
distribution of X along each path Z and the distribution of the paths. The second is to use
a joint model for failure times. This approach is taken by Singpurwalla and Wilson
(1998), Murthy et al. (1995), and Kordonsky and Gertsbakh (1994). Models built using
this approach do not rely explicitly on the notion of a usage path. Due to the inherent 
complexity of explicitly modeling lifetimes and paths in multiple scales, much of the 
recent work in this area focuses on a third approach, that of finding appropriate methods 
for combining scales to form a single scale. When such a combined scale can be found, 
standard univariate reliability tools (including age replacement theory) can be brought to 
bear. Duchesne and Lawless (2000) unify and formalize all previous work in combining 
scales. 

C. MAINTENANCE IN MULTIPLE TIME SCALES 

Much less attention has been given to maintenance policies based on multiple 
scales. In the earliest work in this area, Nakagawa (1985) derives policies for devices 
that fail by either age or usage. He derives the expected cost rate $C(\tau, N)$ of the policy
under which a device is replaced at failure, at chronological age $\tau$, or at a discrete number
N of uses, whichever occurs first. In our setting, however, it is rarely evident whether
failure occurred due to age or usage. In addition, since it is common to have both age and
usage continuous (e.g., scales might be chronological time since production and total
flight time), we need models that allow usage to be continuous as well as discrete.

Unlike Nakagawa (1985), most recent work focuses on finding an appropriate combined
scale to be used for preventive maintenance. With this approach, the cost of age
replacement can then be computed in the combined scale, and, under appropriate
conditions, an optimal replacement age can also be found in that scale. The major work
in this area is done by Kordonsky and Gertsbakh (1994) and along slightly different lines
by Kordonsky and Gertsbakh (1993, 1995, 1997). They restrict attention to linear
combined scales $t(a) = (1-a)x + a\,y(x)$, where $a \in [0,1]$. Under an age replacement policy
in such a scale, a device is replaced at age $\tau$ (in the combined scale) or upon failure at age
$T(a) = (1-a)X + aY$, whichever occurs first. Most recently, Duchesne and Lawless (2000)
propose an “ideal” time scale which generalizes some of the work of Kordonsky and
Gertsbakh. Although not motivated specifically with preventive maintenance in mind,
they suggest that their scale might be used for such purposes. The ideal scale is
developed in order to capture chronological age and usage in such a way that, under
appropriate conditions, the lifetime distribution of a device in this scale is independent of
the path. Thus, in principle, an age replacement policy based on an ideal time scale could
be used for devices regardless of their usage path.
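
To make the linear combined scale concrete, the sketch below converts a replacement age $\tau$ in the combined scale into a chronological replacement age for a device whose usage grows linearly, $y(x) = \theta x$. The linear usage path, the numerical values, and the function names are assumptions made only for this illustration.

```python
def combined_age(x, y, a):
    """Age in the linear combined scale t(a) = (1 - a) * x + a * y."""
    return (1.0 - a) * x + a * y

def should_replace(x, y, a, tau):
    """Preventive replacement is due once the combined age reaches tau."""
    return combined_age(x, y, a) >= tau

def chronological_replacement_age(tau, a, slope):
    """Chronological age at which a device on the path y = slope * x reaches
    combined age tau; along that path t(a) = ((1 - a) + a * slope) * x."""
    return tau / ((1.0 - a) + a * slope)

# Hypothetical values: tau = 100 in the combined scale, a = 0.25, slope = 2.
x_rep = chronological_replacement_age(tau=100.0, a=0.25, slope=2.0)
print(x_rep, combined_age(x_rep, 2.0 * x_rep, 0.25))
```

With a = 0 the rule reduces to replacement at chronological age $\tau$, and with a = 1 to replacement at usage $\tau$, so devices with heavier usage are replaced at earlier chronological ages whenever a > 0.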

Because using combined scales reduces the problem of maintenance in multiple 
scales to that of maintenance in a single scale, it has the advantage of being tractable and 
easily understood. Combined scales, however, do not completely address the problems 
of maintenance in multiple scales. Absent from the literature is discussion of the
translation of policies developed in combined scales to policies in the original scales.
Upon performing such a translation, it is clear that policies based on linear scales
correspond to replacing devices if their joint failure time (X,Y) falls in the region
$M = \{(x,y(x)): (1-a)x + a\,y(x) < \tau\}$ or when their usage curve crosses the boundary of this
region, whichever occurs first. Similarly, policies based on an ideal time scale
correspond to regions in the positive quadrant whose upper boundaries follow the 
contours of the ideal time scale. Considering such regions in the original scales suggests 
a more general class of policies that should be considered when searching for the optimal 
policy. Also absent from the literature are methods for comparing the cost of policies 
based on combined scales of different forms. The approach of Kordonsky and Gertsbakh 
(1994) does provide a means for comparing costs in the special case of the family of 
linear scales. As such, the need arises for a means to compare the cost of policies from a 
larger class of alternatives. 

In this dissertation we directly attack the problem of estimating optimal age 
replacement policies for devices with age measured in multiple scales in two different 
settings. In both, our focus is to search over a large class of sensible policies to minimize 
estimated long-run costs. To do so, we first define a class of multiple-scale policies 
which generalize policies found in previous works. In Chapter II, we use several real 
data sets to help develop insight into our choice of this class of potential policies.
Because this class of policies is related to policies produced by combined scales, in
Chapter III we review and discuss in detail how multiple-scale policies are obtained using
the scale-combining approaches found in the literature. In that chapter we also discuss
how these policies fit into the framework established in Chapter II. In so doing, we raise
significant concerns that reveal the need for new methods. Since usage paths are often
well-approximated by straight lines, in Chapter IV we develop estimators of the cost
function and optimal policy for the case in which devices age along linear usage paths.
In Chapter V we discuss the large- and small-sample properties of these estimators and
compare their performances with policies based on a common scale-combining approach.
In Chapter VI we develop a cost function for policies under a joint model for (X,Y) and
present numerical results obtained from solving the corresponding optimization problem
for rectangular-shaped policies. In Chapter VII we highlight our contributions and
present opportunities for further research.

II. EXTENDING AGE REPLACEMENT THEORY TO MULTIPLE TIME SCALES

We seek to generalize the classical age replacement policy, under which a device 
is replaced at age $\tau$ or failure (whichever occurs first), to a policy based on age measured
in multiple scales. The cost function used to define an optimal policy is based on the 
mechanism generating the failures. However, the general form of a sensible multiple- 
scale age replacement policy applies equally to many failure models. In this chapter, we 
introduce three data sets to help develop insight into an appropriate form for a multiple- 
scale age replacement policy. The data sets are chosen to represent situations for which 
either the conditional modeling approach or the joint modeling approach may be 
appropriate. In the first and third data sets, it is apparent that failures occur along fixed 
linear usage paths. In such a situation, an appropriate model is one which generates 
failures conditioned on the usage path and then utilizes a mixing distribution over the 
paths. However, in the second data set, there are no clear usage paths and the data are 
better modeled by a joint distribution. After considering the three data sets, we 
generalize the form of an age replacement policy to incorporate multiple time scales. 


A. INTRODUCTORY CASE STUDIES 

Under a single-scale policy with replacement time $\tau$, a device is replaced if it fails
in the interval $(0,\tau)$ or if its time in use (the one-dimensional equivalent of a usage path)
crosses the right-most boundary of $(0,\tau)$. As we generalize to the case of multiple scales,
it will be convenient to identify a policy by the multiple-scale equivalent of the failure
replacement interval $(0,\tau)$. This leads to consideration of policies defined by regions M.
Here, a device is replaced if (X,Y) is in M (i.e., upon failure) or when its usage path
crosses the boundary of M, whichever occurs first. For now, we consider how such
policies might be constructed based on observed bivariate failure times $(x_1,y_1), \ldots, (x_n,y_n)$.
In what follows we use the notation $R(x,y)$ to denote the rectangle $(0,x) \times (0,y)$.

Case Study 1 

Consider policy $M_X = R(\hat{\tau}, \infty)$, where $\hat{\tau}$ minimizes the empirical cost function
(1.3) based on the first components $x_1, \ldots, x_n$. Under this policy, we replace the device
when its age reaches $\hat{\tau}$ or when it fails, whichever occurs first, regardless of the usage accrued.
Although constructed in a rather naive manner, such a policy may be adequate in some 
cases. 

For example, consider the locomotive traction motor failure data in Singpurwalla 
and Wilson (1998). The data (see Appendix B) consists of the time since inception of 
service and mileage at failure of forty locomotive traction motors. Figure 2.1 shows a 
scatterplot of the failure data in the time scales number of days and number of miles and 
the regression fit through the origin. The coefficient of determination exceeds 99%. For 
these data, knowing the number of days at failure is almost equivalent to knowing the 
number of miles at failure since all exemplars have virtually identical usage rates (i.e., 
number of miles per day). Hence a “naive” policy based solely on chronological age 
suffices. Similarly, we could consider a mileage-based policy $M_Y = R(\infty, \hat{v})$, where $\hat{v}$
minimizes (1.3) based on $y_1, \ldots, y_n$. In fact, for ratios $K/C > 0.25$, the two regions
$M_X = R(1200, \infty)$ and $M_Y = R(\infty, 57304)$ are based on the same observation, namely
(1200, 57304).



Figure 2.1: Traction Motor Data with Regression Line. 
Triangles represent the number of days and miles until a failure occurred. 


Case Study 2 

A policy based on a single scale may not be satisfactory for lifetime data arising 
from devices having differing usage paths. Figure 2.2 shows a scatterplot of failure times 
of jet engines, discussed by Gertsbakh and Kordonsky (1998). This data set (see 
Appendix B) contains the flight hours and number of landings at failure of 21 Aeroflot jet 
engines. Unlike the first data set, the failures have occurred along several usage paths, 
and these paths are not provided or evident from Figure 2.2. Thus, knowing the number
of hours at failure is not equivalent to knowing the number of landings at failure. Hence,
a policy based on only the flight hours at failure or only the number of landings at failure
is likely to ignore information that could potentially reduce maintenance costs. In fact,
for $K/C = 0.5$, $M_X = R(4932, \infty)$ and $M_Y = R(\infty, 1152)$; these two policies (with boundaries
delimited by the overlaid dashed lines in Figure 2.2) are based on the vastly “different”
observations (4932, 1960) and (3227, 1152), respectively.

Figure 2.2: Jet Engine Data.

Triangles represent the number of flight hours and landings until a failure 
occurred. The vertical and horizontal lines represent the boundaries of, 
respectively, a policy triggered solely by the number of flight hours at 
failure and the policy triggered solely by the number of landings at failure. 


Such policies, however, are often used. Gertsbakh and Kordonsky (1997) note 
that a single distribution is often fit to lifetime data arising from devices operating in 
heterogeneous environments. An “optimal” policy is estimated from this distribution and
applied to the entire population. Policies of this form ignore the bivariate nature of the
failure data. For example, under policy $M_X$, devices with lifetimes $(x,y)$ and $(x,2y)$ are
treated in the same manner, even though the latter device is “older” in some sense than
the former. A policy which somehow incorporates the additional information contained
in the paired failure times seems “better” than $M_X$. Consider the policy $M_{XY} = R(\hat{\tau}, \hat{v})$,
formed by combining $M_X$ and $M_Y$. Under policy $M_{XY}$ we replace a non-failed device
when it accrues either age $\hat{\tau}$ or usage $\hat{v}$, whichever occurs first; $\hat{\tau}$ and $\hat{v}$ are estimates of
the optimal replacement times in the two single-scale age replacement problems. Policy
$M_{XY}$ seems to be an improvement over both $M_X$ and $M_Y$, since it is based on all the data
and since in some cases (for example) devices with lifetimes $(x,y)$ and $(x,2y)$ are treated
differently. Nevertheless, the separate computation of $\hat{\tau}$ and $\hat{v}$ ignores the dependency
between the failure times in the two scales. Policy $M_{XY}$ is based only on estimates of the
marginal distributions of failure time in the two scales, and thus does not fully account
for the joint effect of age and usage on failure. A bivariate policy should somehow
account for this dependence. Kordonsky and Gertsbakh (1995) explain, “Each particular
time scale reflects indirectly a most relevant process of damage accumulation, but fails to
reflect the joint, interactive action of these processes. For an aircraft... ‘time in the air’
and ‘number of flights’ both reflect fatigue damage accumulation, but each scale
separately is not able to reflect ‘total’ fatigue damage.” In Chapter VI, we develop a cost
function that can be used to find the “best” policy of the form $M_{XY}$.

Case Study 3


Consider failures due to metal fatigue (see Appendix B), discussed in Kordonsky
and Gertsbakh (1993). The metal fatigue data plotted in Figure 2.3 consists of 30
observations, five on each of six distinct paths. Specimens on a particular path are
subjected to bending through a repetitive pattern of a fixed number of small-amplitude
(low-load) cycles followed by a fixed number of large-amplitude (high-load) cycles until
failure. In Figure 2.3, the scale along the horizontal axis is the number of low-load cycles
and the scale along the vertical axis is the number of high-load cycles. By design, the
observations fall almost perfectly on lines of slopes $\theta_1 = 0.053$, $\theta_2 = 0.250$, $\theta_3 = 0.667$,
$\theta_4 = 1.5$, $\theta_5 = 4$, and $\theta_6 = 19$. The dashed lines in Figure 2.3 represent these approximate
linear usage paths.



Figure 2.3: Metal Data with Approximate Linear Usage Paths.

Each triangle represents the number of low-load and high-load cycles until
a failure occurred, scaled by a factor of 1/10.


In data sets of this form, each device ages along a linear path of slope $\theta_i$,
$i = 1, \ldots, m$, where $0 < \theta_1 < \theta_2 < \cdots < \theta_m < \infty$. As such, the data set can be naturally
partitioned into m samples, each consisting of failure data along a linear path. As with
the traction motors, a policy can be specified for devices along a given usage path solely
in terms of chronological time, since at any time $x > 0$ the position of a device along its
usage path is known. To construct such a policy, consider the sample along each usage
path separately. That is, use the $n_i$ chronological ages at failure along the $i$th path to
estimate $F_i$, the conditional lifetime distribution of $X \mid \Theta = \theta_i$. Then, use the empirical cost
function (1.3) to estimate the optimal age replacement policy $\hat{\tau}_i$ (which applies only to
devices on the $i$th path). The resulting policy, with replacement times summarized in
vector $(\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_m)$, takes the following form: replace a non-failed device on path i
when its chronological age reaches $\hat{\tau}_i$, $i = 1, \ldots, m$.

For the metal data, suppose each $F_i$ is estimated with the empirical distribution,
placing mass $1/n_i = 0.2$ on each observation on the $i$th path. Upon doing so, for $K/C = 0.5$
we obtain the following estimates: $\hat{\tau}_1 = 23580$, $\hat{\tau}_2 = 10300$, $\hat{\tau}_3 = 5700$, $\hat{\tau}_4 = 3200$,
$\hat{\tau}_5 = 1000$, $\hat{\tau}_6 = 275$. Hence, the “composite” policy is as follows: replace non-failed
devices on path 1 at age 23580; ...; replace non-failed devices on path 6 at age 275. The
region corresponding to this policy is depicted on the left side of Figure 2.4.

At first glance, the proposed composite policy seems reasonable; however, the
implementation of the policy is problematic. Consider two devices, say A and B.
Suppose A has usage path 5, namely $\{(x,4x): x > 0\}$, and B has usage path 4, namely
$\{(x,1.5x): x > 0\}$. Under the composite policy, if device A is still operating we would
replace it preventively when its age reaches $x = 1000$; at this time, it has usage
$y(x) = 4000$. However, if device B is still operating at $x = 3000$, we would not replace it;
at this time, its usage is $y(x) = 4500$. The metal fatigue experiment was designed so that
the accumulation of low-load cycles and the accumulation of high-load cycles are the
only factors leading to device failure. As such, this composite policy does not seem
sensible, because device B is older than device A in every respect.




Figure 2.4: Composite Policies for the Metal Data. 

The solid lines on the left side of the figure represent the failure replacement
region for the policy with replacement time vector (23580, 10300, 5700, 
3200, 1000, 275). The right side of the figure depicts the failure 
replacement region for the policy with replacement time vector (10000, 
10300, 5700, 3200,1200, 275). 


However, this is not the only problem we could encounter using this approach.
Consider the same data, and suppose that instead of the policy suggested above, we
obtain policy (10000, 10300, 5700, 3200, 1200, 275). The region corresponding to this
policy is depicted on the right side of Figure 2.4. Suppose device A is on path 1 and
device B is on path 2. Under this policy, if device A is still operating at age $x = 10000$,
we would replace it preventively; at this time its cumulative usage is $y(x) = 526$.
However, if device B is still operating at age $x = 10000$, we would not replace it (as it has
not yet reached age $x = 10300$); at age $x = 10000$ its cumulative usage is $y(x) = 2500$.
Device B is older than A in every respect; this composite policy does not seem sensible
either. We now investigate the notion of a “sensible” policy in more detail.
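
The two difficulties just described can be detected mechanically. The sketch below takes the slopes and the two replacement-time vectors quoted in this case study and lists every pair of paths (i, j) for which the preventive replacement point on path i is strictly smaller, in both scales, than the replacement point on path j; for such a pair, a device on path j shortly before its replacement is older in every respect than a device replaced on path i. The function names and the strict-dominance test are our own illustrative choices.

```python
# Slopes of the six approximate linear usage paths in the metal fatigue data.
SLOPES = [0.053, 0.250, 0.667, 1.5, 4.0, 19.0]

def replacement_points(slopes, ages):
    """Point (x, y) at which a device on each path is preventively replaced."""
    return [(age, slope * age) for slope, age in zip(slopes, ages)]

def dominated_pairs(slopes, ages):
    """Return 1-based path pairs (i, j) where the replacement point on path i
    is strictly below the replacement point on path j in both scales."""
    points = replacement_points(slopes, ages)
    pairs = []
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if xi < xj and yi < yj:
                pairs.append((i + 1, j + 1))
    return pairs

first_policy = [23580, 10300, 5700, 3200, 1000, 275]
second_policy = [10000, 10300, 5700, 3200, 1200, 275]
print(dominated_pairs(SLOPES, first_policy))   # paths 5 and 4, as in the text
print(dominated_pairs(SLOPES, second_policy))  # paths 1 and 2, as in the text
```

An empty list is what one would hope to see for a composite policy; this check is only a sketch for linear paths through the origin and is not the formal criterion developed in the next section.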

B. DESCRIPTION OF A CLASS OF MULTIPLE-SCALE POLICIES 

In this section we describe a class of multiple-scale policies which generalizes the 
class of single-scale policies $\{(0,\tau): \tau > 0\}$. We assume devices under consideration may
differ only in their age in chronological time and in the amount of usage accumulated.
As such, we implicitly assume there are no “hidden” covariates (e.g., better
environmental conditions for certain devices, or additional measures of usage) affecting
the process leading to eventual device failure. One example of a policy which
generalizes the policy $(0,\tau)$ is $M = (0,u) \times (0,v)$, where $u > 0$ and $v > 0$, as considered in
Case Study 2. Under this policy, a device is replaced if it fails at a time (X,Y) where
$X < u$ and $Y < v$ or when its usage path crosses the boundary $x = u$ or $y = v$, whichever
occurs first. Kordonsky and Gertsbakh (1994) devise policies based on lifetimes in two
scales by projecting failure times onto a single time scale of the form $t = (1-a)x + a\,y(x)$,
in which they define a replacement age $\tau_a$. This policy replaces at age $t = \tau_a$ or upon
failure, whichever occurs first. In the original two scales, this policy corresponds to the
region $M = \{(x,y(x)): (1-a)x + a\,y(x) < \tau_a\}$. In fact, for $a = 0$, $M = M_X$, as in Case Study 1;
similarly, for $a = 1$, $M = M_Y$; when $0 < a < 1$, M is a right triangle with right angle at the
origin.

On the other hand, consider the policy M depicted in Figure 2.5. From a 
preventive maintenance standpoint, this policy is not sensible since the device with 
(x,y(x)) = (50,25) would be replaced preventively, but a non-failed device with 
(x,y(x)) = (55,90) would not be replaced, even though it is older than the first device in 
both time scales. In order to be sensible under the assumptions described above, a policy 
prescribing preventive replacement of a device should prescribe preventive replacement 
of any “older” device. On the other hand, if a policy stipulates that a device should not 
be replaced preventively, then any “younger” device should not be replaced preventively 
either. To describe this more formally, we need a means of ordering two-dimensional 
failure times.

Figure 2.5: Undesirable Policy.

Under this policy, for example, a non-failed aircraft component with $x = 50$
flight hours and $y(x) = 25$ landings would be replaced, but one with $x = 55$
flight hours and $y(x) = 90$ landings would not be replaced.


A binary relation $\preceq$ on a set $\mathcal{X}$ is a simple order on $\mathcal{X}$ if it is reflexive, transitive,
anti-symmetric, and the members of every pair of elements of $\mathcal{X}$ are comparable. The
relation $\preceq$ is a partial order on a set $\mathcal{X}$ if it is reflexive, transitive, and anti-symmetric
(thus, simple orders are partial orders; however, for partial orders, certain elements of $\mathcal{X}$
may be non-comparable). In addition, $L \subset \mathcal{X}$ is a lower set with respect to a partial order
$\preceq$ if $u \in L$, $v \in \mathcal{X}$, and $v \preceq u$ imply $v \in L$ (e.g., Robertson, Wright, and Dykstra, 1988); a
lower set contains all “predecessors” of each of its members. For failure times $u = (u_1,u_2)$
and $v = (v_1,v_2)$ in $\mathcal{X} \subseteq (0,\infty)^2$, we take $\preceq$ to be the matrix partial order where $u \preceq v$ if and
only if $u_1 \le v_1$ and $u_2 \le v_2$. Note that $\mathcal{X}$ may be a proper subset of $(0,\infty)^2$, as in Case Study
3, where all failure times lie along one of six linear usage paths.

Using these definitions, we now characterize a class of policies for the multiple- 
scale age replacement problem. For ease of exposition, they are described in the plane. 
Let $\mathcal{X}$ denote the support of (X,Y), and $\mathcal{M}_{\mathcal{X}}$ denote the class of all open lower sets with
respect to the matrix partial order on $\mathcal{X}$. Observe that for $\mathcal{X} = (0,\infty)$, the class of
single-scale policies $\{(0,\tau): \tau > 0\}$ is the class of open lower sets with respect to the simple
order $\le$ on $(0,\infty)$. Thus, $\mathcal{M}_{\mathcal{X}}$ is a natural generalization of the class of single-scale policies.
In addition, members of $\mathcal{M}_{\mathcal{X}}$ are “sensible” policies from the standpoint of
implementation when failure characteristics are captured by the two time scales. In the
literature, Murthy et al. (1995) use rectangular, triangular, and other planar regions as
warranty policies; every region they consider is a lower set with respect to the matrix
partial order on $(0,\infty)^2$. Similarly, the policies developed in Case Studies 1 and 2 above
are members of $\mathcal{M}_{\mathcal{X}}$, but the policies described in Case Study 3 are not. For ages
measured in $k > 2$ scales, the notation is easily extended so that $\mathcal{M}_{\mathcal{X}}$ is the class of open
lower sets in $\mathcal{X} \subseteq (0,\infty)^k$ with respect to the matrix partial order generalized to $(0,\infty)^k$.
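
The lower-set definition can be checked directly on a finite collection of points, which may help fix ideas. The sketch below is a discrete analogue only: the finite grid stands in for the support, openness is not modeled, and the grid, the regions, and the function names are hypothetical.

```python
def precedes(u, v):
    """Matrix partial order: u precedes v if u_k <= v_k in every coordinate."""
    return all(uk <= vk for uk, vk in zip(u, v))

def is_lower_set(region, space):
    """True if every point of `space` that precedes a member of `region`
    also belongs to `region`."""
    members = set(region)
    return all(v in members
               for u in members
               for v in space
               if precedes(v, u))

# Hypothetical finite support: a 3-by-3 grid of (age, usage) points.
space = [(x, y) for x in (1, 2, 3) for y in (1, 2, 3)]
rectangle = [(x, y) for (x, y) in space if x < 3 and y < 3]
triangle = [(x, y) for (x, y) in space if x + y < 4]
gap = [(1, 1), (2, 2)]  # omits (1, 2) and (2, 1), which precede (2, 2)

print(is_lower_set(rectangle, space), is_lower_set(triangle, space),
      is_lower_set(gap, space))
```

The rectangular and triangular regions mirror the shapes discussed above, while the third region omits predecessors of one of its members and so fails the definition.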

C. NESTED POLICIES 

Let $r = K/C$ denote the ratio of the preventive replacement cost and the additional 
cost to replace a device due to failure. As $r$ decreases, it becomes proportionally more 
costly to replace at failure, so the replacement age based on a single scale should be more 
conservative. To show this, we make explicit the dependence on cost ratio $r$ and define 

\[ D(\tau; r) = C(\tau)/C, \tag{2.1} \]

where $C(\tau)$ is the cost function in (1.1). Let $\nu = \inf\{x: S(x) = 0\}$; $\nu \le \infty$. Then, for cost 
ratio $s < r$, 

\[ D(\tau; r) - D(\tau; s) = \frac{r - s}{\int_0^{\tau} S(u)\,du} \tag{2.2} \]

is a positive, continuous, and strictly decreasing function of $\tau$ on $(0, \nu)$. Suppose $D(\tau;s)$, 
and hence $C(\tau)$ with cost ratio $s$, attains a minimum at $\tau^*(s)$; there may be several 
minima. It can be shown that $\tau^*(s) \le \nu$. For $\tau < \tau^*(s)$, 

\[ D(\tau;r) - D(\tau^*(s);r) = [D(\tau;r) - D(\tau;s)] + [D(\tau;s) - D(\tau^*(s);s)] + [D(\tau^*(s);s) - D(\tau^*(s);r)]. \tag{2.3} \]

Since $\tau^*(s)$ minimizes $D(\tau;s)$, the second term on the right-hand side of (2.3) is 
non-negative; in addition, because (2.2) is strictly decreasing on $(0, \nu)$, the sum of the first and 
third terms is positive. Thus, $D(\tau;r) > D(\tau^*(s);r)$ for all $\tau < \tau^*(s)$, and it follows that $C(\tau)$ 
with cost ratio $r$ can only attain a minimum for $\tau \ge \tau^*(s)$. 

Hence, for a decreasing sequence of cost ratios $r_1, r_2, \ldots$, the corresponding 
single-scale policies are nested. That is, if the corresponding optimal replacement times 
are, respectively, $\tau_1, \tau_2, \ldots$, then we know $\tau_1 \ge \tau_2 \ge \cdots$, so that the policies 
$(0,\tau_1) \supseteq (0,\tau_2) \supseteq \cdots$ form a sequence of nested lower sets. 
Multiple-scale age replacement policies should also be more conservative as $r$ 
decreases; in particular, policies for smaller $r$ should be subsets of those for larger $r$. Let 
$\mathcal{X} = (0,\infty)^2$. Consider the policies based on region $M_1 = \{(x,y(x)): x + y(x) < 6, x > 0\}$ for 
$r_1 = 1$ and $M_2 = \{(x,y(x)): 5x + y(x) < 15, x > 0\}$ for $r_2 = 0.5$. Note both $M_1$ and $M_2$ are in 
$\mathcal{M}_{\mathcal{X}}$. Now, consider a device with linear usage path $y(x) = 5x$; the policies and usage path 
are depicted in Figure 2.6. This example illustrates that non-nested multiple-scale policies 
can prescribe replacement times that are not sensible. With $r_1 = 1$, the additional cost to 
replace a device due to failure is equal to the preventive replacement cost, while $r_2 = 0.5$ 
means the additional cost to replace a device due to failure is twice the preventive 
replacement cost. Thus, it seems policy $M_2$ should hedge against this higher failure 
replacement cost and suggest replacement at an earlier time than the time suggested by 
policy $M_1$. On this usage path, however, $M_1$ prescribes replacement at $x = 1$ whereas $M_2$ 
prescribes replacement at the later time $x = 1.5$. 
Figure 2.6: Non-nested Policies. 

Solid lines represent boundaries of policies $M_1$ and $M_2$ and the dashed line 
represents a linear usage path of slope 5. Under policy $M_1$, non-failed 
devices on this path are replaced when $x = 1$; under policy $M_2$, non-failed 
devices on this path are replaced when $x = 1.5$. 
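
The replacement times quoted for Figure 2.6 can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the helper name `replacement_time` and the parameterization of a policy boundary as a*x + b*y = c are assumptions introduced here, not notation from the text.

```python
def replacement_time(a, b, c, theta):
    """Chronological time at which the linear usage path y = theta * x
    leaves the policy region {(x, y): a*x + b*y < c}."""
    return c / (a + b * theta)


theta = 5.0  # slope of the usage path in the example

# M1 = {x + y < 6} for cost ratio r1 = 1
tau_m1 = replacement_time(1.0, 1.0, 6.0, theta)
# M2 = {5x + y < 15} for cost ratio r2 = 0.5
tau_m2 = replacement_time(5.0, 1.0, 15.0, theta)

print(tau_m1, tau_m2)
```

The policy for the smaller cost ratio replaces later on this path, which is the behavior the text describes as not sensible.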
III. POLICIES BASED ON COMBINED SCALES 


Due to the complexity of modeling lifetimes in multiple scales, much of the recent 
work in this area focuses on finding appropriate methods for combining scales to form a 
single time scale. Once such a combined scale is found, reliability tools such as age 
replacement theory can be brought to bear. We begin with a general discussion of 
combined scales. We then consider in detail three combined time scales in the literature 
that seem best suited for age replacement policies given failure data in two scales, age 
and usage. The first, and in a sense closest in spirit to our efforts, is the work of 
Kordonsky and Gertsbakh (1994) in which a combined scale is found for age 
replacement. The next two scales discussed are the “minimum CV” scale of Kordonsky 
and Gertsbakh (1993, 1995, 1997) and the “ideal” time scale of Duchesne and Lawless 
(2000). Both of these time scales are based solely on the underlying failure models and 
are developed independently of the age replacement problem. However, Gertsbakh and 
Kordonsky (1997) do suggest a context in which their min CV scale is “optimal” for 
preventive maintenance and Duchesne (1999) suggests his scales might be useful for 
maintenance planning. 

A. COMBINED TIME SCALES 

A formal definition of “time scale” is given by Duchesne and Lawless (2000). 

Let the set of all device usage paths $Z(x)$ be $\mathcal{Z}(x)$. For a particular device, let the “whole” 
usage path be $Z = Z(\infty)$; let the set of all such paths be $\mathcal{Z} = \mathcal{Z}(\infty)$. A time scale $\Phi(x, Z(x))$ 
is a non-negative real-valued functional of $x$ and the path $Z$ up to age $x$; it is required to 
be non-decreasing in $x$ for all $Z$ in $\mathcal{Z}$. Hence, a time scale is a function of chronological 
time and external covariates. Recent research efforts focus on finding a time scale for 
which $t_Z(x) = \Phi(x, Z(x))$ suffices for the calculation of probabilities for failures modeled in 
two scales. Oakes (1995) introduces the notion of the “collapsibility” of two time scales 
into one time scale which is “fully informative” in the sense that the probability of 
survival to a specified point (in the plane) depends only on the location of the point, not 
on the path taken to get to the point. Specifically, following Duchesne and Lawless 
(2000), the distribution of $X \mid Z$ is “collapsible in $y(x)$” if the survival probability at time $x$ 
depends on the path $Z$ up to $x$ only through its endpoint $(x, y(x))$. Thus, a time scale 
for a collapsible model can be written as $t_Z(x) = \Phi(x, y(x))$. Collapsible models are 
common in the literature since in many cases $X$ and $Y = y(X)$ are observable but the 
history $Z(X)$ is unknown. If the usage path is approximated by a straight line, the 
resulting models are collapsible since $y(x) = \theta x$, and hence the path $Z$ is determined by its 
value $y(x)$ at any time $x$. 

To illustrate the consequences of combining time scales in a collapsible model, 
consider the time scale t = x + gy(x) for some g > 0. Note t induces a family of contours 
$\{y = (t - x)/g,\ t \in (0,\infty)\}$, as depicted in Figure 3.1 (Duchesne, 1999). The points where 
the usage paths intersect a given dotted contour line all have the same age (in the 
combined scale). 
Figure 3.1: Contours of Linear Scale in a Collapsible Model. 

Jagged lines represent device usage paths and dashed lines represent 
contours of a linear time scale. Reproduced from Duchesne (1999). 

This family of contours provides a means to compare points on different usage 
paths that may be non-comparable with respect to the matrix partial order. Consider the 
points of intersection of contour $t = 4$ with the four usage paths in Figure 3.1. The matrix 
partial order does not enable us to determine the relative age of devices having age and 
usage represented by these points. On the other hand, the four points have the same age 
in scale $t$. Thus, the combined scale $t$ induces an ordering (by age in this scale) of a set of 
points $(x_1, y(x_1)), (x_2, y(x_2)), \ldots, (x_n, y(x_n))$. In addition, as illustrated by the contours, the 
scale provides a means of specifying the relative age of one device in relation to another. 

Different time scales order and “space” a given set of lifetimes differently. To 
illustrate this, consider Figure 3.2, which contains a scatterplot of labeled points 
$(x_1, y_1), (x_2, y_2), \ldots, (x_{10}, y_{10})$ randomly generated from the unit square; lines of slope $-1$ 
correspond to contours of scale $t = x + y(x)$ and lines of slope $-10$ correspond to contours 
of scale $s = x + 0.1y(x)$. Table 3.1 lists the coordinates of the points, their “age” in the 
two scales, and their ranks $r(t)$ and $r(s)$ in the two scales $t$ and $s$. 





Figure 3.2: Contours of Linear Scales t and s. 

Labeled points are randomly generated from the unit square. Lines of 
slope -10 and -1 are contours of linear time scales s and t, respectively. 
Table 3.1: The “Action” of Two Different Time Scales. 

This table summarizes some of the information in Figure 3.2. Row 7 
indicates $(x_7, y_7)$ has age 0.84 in scale $t$, age 0.65 in scale $s$, is the fourth 
“youngest” point in scale $t$, and is the eighth “youngest” in scale $s$. 
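
The reordering effect can be illustrated with a short computation. This is a minimal sketch using four made-up points in the unit square (they are not the points of Figure 3.2 or Table 3.1); the helper `ranks` is likewise introduced here only for illustration.

```python
# Hypothetical (x, y) points in the unit square; NOT the data of Table 3.1.
points = [(0.10, 0.90), (0.50, 0.20), (0.30, 0.45), (0.80, 0.05)]


def ranks(ages):
    """Rank 1 = 'youngest' (smallest age); assumes no ties."""
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    result = [0] * len(ages)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


t_age = [x + y for x, y in points]        # scale t = x + y(x)
s_age = [x + 0.1 * y for x, y in points]  # scale s = x + 0.1 y(x)

rank_t = ranks(t_age)
rank_s = ranks(s_age)
print(rank_t, rank_s)
```

In this toy example the first point is the oldest in scale t but the youngest in scale s, so the choice of combined scale changes which device is treated as closest to replacement.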
Using the combined scale, an age replacement policy can be expressed as $(0, \tau)$. 
In this form, a policy may have limited utility to the practitioner. On the other hand, a 
graphical depiction of this policy in terms of the original scales age and usage is very 
useful. In the original scales, the $t$-scale policy $(0, \tau)$ is equivalent to 
$M = \{(x, y(x)): \Phi(x, y(x)) < \tau\}$. For example, policy $(0, 0.4)$ in scale $t$ above “translates” to 
the policy $\{(x, y(x)): x + y(x) < 0.4\}$ in Figure 3.2. In fact, the policy $M_x$ discussed in 
Chapter II is a special case of such a “translation”; in this case, the combined scale is 
simply $x$. For most combined scales found in practice, an age replacement policy in the 
combined scale corresponds to a lower set in the original scales. This is only the case, 
however, when the combined time scale $\Phi$ is such that for $(x_1, y_1(x_1))$ and $(x_2, y_2(x_2))$ where 
$x_1 \le x_2$ and $y_1(x_1) \le y_2(x_2)$ we have $\Phi(x_1, y_1(x_1)) \le \Phi(x_2, y_2(x_2))$. In other words, since time 
scales are by definition required only to be non-decreasing in $x$ for any $Z$, it is possible to 
construct combined scales for which the policy in the original scales is not a lower set. 


B. A COMBINED SCALE FOR AGE REPLACEMENT 

Kordonsky and Gertsbakh (1994) find the “best” scale for age replacement among 
the family of scales that are convex combinations of the two scales of age and usage. 
They consider the family of scales $\{t(a) = (1-a)x + a\,y(x),\ a \in [0,1]\}$; in scale $t(a)$ the 
lifetime is $T(a) = (1-a)X + aY$. The geometric interpretation of times in scale $t(a)$ is 
insightful. Time $t(a) = (1-a)x + a\,y(x)$ is proportional to the length of the orthogonal 
projection of the point $(x, y(x))$ onto the vector $(1-a, a)$; the search for the “best” scale is 
essentially a search for the “best” such vector onto which to project the data. 
For a fixed $a$, let $F_a(t) = P(T(a) \le t)$, and define 

\[ C_a(\tau) = \frac{K + C\,F_a(\tau)}{\int_0^{\tau} (1 - F_a(u))\,du}, \quad \tau > 0. \tag{3.1} \]


Thus, $C_a(\tau)$ is identical to the long-run average cost function (1.1). To find the “best” $a$, 
it seems reasonable to find, for a given $a$, the optimal replacement time in this scale (say 
$\tau_a$), and then search $[0,1]$ for the $a$ yielding minimal $C_a(\tau_a)$. However, $C_a(\tau_a)$ has 
dimension cost per unit of time in the scale $t(a)$. Thus, values of $C_a(\tau_a)$ must be 
“converted” to make them comparable. To this end, Kordonsky and Gertsbakh convert 
(3.1) into a cost function with dimension cost per unit of chronological time in the 
following way. Because the average lifetime in scale $t(a)$ is $E[T(a)]$ and the average 
lifetime in chronological time $x$ is $E[X]$, then from a damage accumulation perspective 
one unit of “$t(a)$-time” is equivalent to $E[X]/E[T(a)]$ units of $x$-time. Hence, the 
“converted” cost function is 

\[ D_a(\tau) = C_a(\tau)\,E[T(a)]/E[X], \quad \tau > 0. \tag{3.2} \]

Let $\tau_a = \arg\min_{\tau} D_a(\tau)$. By definition, the “best” scale corresponds to the $a^*$ which yields 
the minimum value of $D_a(\tau_a)$. 
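
The search described above can be sketched numerically. The code below is not the estimator of Kordonsky and Gertsbakh; it is a minimal illustration that assumes (i) an empirical version of (3.1) evaluated only at the observed lifetimes, with the convention that the empirical distribution function equals (k+1)/n at the (k+1)-th ordered lifetime, (ii) sample means in place of the expectations in (3.2), and (iii) a plain grid search over a. The data and the function names `empirical_cost` and `best_scale` are hypothetical.

```python
def empirical_cost(lifetimes, K, C):
    """Minimise (K + C*F_hat(tau)) / int_0^tau S_hat(u) du over the observed
    lifetimes (assumed positive and distinct). Returns (cost, tau)."""
    xs = sorted(lifetimes)
    n = len(xs)
    best_cost, best_tau = float("inf"), None
    area, prev = 0.0, 0.0  # running integral of the empirical survivor function
    for k, x in enumerate(xs):
        area += (n - k) / n * (x - prev)     # S_hat = (n - k)/n on [prev, x)
        prev = x
        cost = (K + C * (k + 1) / n) / area  # assumed convention for F_hat
        if cost < best_cost:
            best_cost, best_tau = cost, x
    return best_cost, best_tau


def best_scale(data, K, C, grid=101):
    """Grid search over a in [0, 1] for the smallest converted cost (3.2)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    best = None
    for j in range(grid):
        a = j / (grid - 1)
        t = [(1 - a) * x + a * y for x, y in data]
        cost, tau = empirical_cost(t, K, C)
        converted = cost * (sum(t) / n) / mean_x  # cost per unit of x-time
        if best is None or converted < best[0]:
            best = (converted, a, tau)
    return best


# Hypothetical (age, usage) failure data; K and C give cost ratio r = 0.5.
data = [(2.0, 30.0), (3.5, 20.0), (4.0, 55.0),
        (5.5, 35.0), (6.0, 80.0), (7.5, 60.0)]
converted, a_best, tau_best = best_scale(data, K=0.5, C=1.0)
print(a_best, tau_best, converted)
```

Rerunning `best_scale` with different values of K changes the cost ratio and may change the selected a.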

Kordonsky and Gertsbakh estimate $a^*$ nonparametrically based on a simple 
random sample $(X_1,Y_1), (X_2,Y_2), \ldots, (X_n,Y_n)$. Care needs to be taken in applying their 
method, however. Consider the auto data set, taken from Wilson (1993), which can 
be found in Appendix B. The boundaries of the policies for cost ratios $r$ = 0.5, 0.25, and 
0.125 are depicted in Figure 3.3. The policies are lower sets, but they exhibit the 
non-nested behavior seen in Figure 2.6. This suggests that the “best” scale is a function 
of the cost ratio. For the metal data, however, the policies are nested for $\{r: 0 < r < 1\}$. 
We suspect the non-nestedness of the policies derived from the auto data may be caused, 
in part, by the lack of sufficient spread in the distribution of usage paths. As such, it can 
be argued that non-nestedness is exhibited here since most observations in the auto data 
set fall roughly along a single regression line fit through the origin (unlike the metal 
data). 



Figure 3.3: Non-nested Policies for Auto Data Based on “Best Scale” Method. 
Triangles represent the number of days and miles until a failure occurred. 
Labeled lines are policy boundaries for cost ratios r= 0.5, 0.25, and 0.125. 


C. POLICIES BASED ON MINIMUM CV SCALE 

We now examine another combined scale on which policies can be based. 
Consider again the family of linear scales $\mathcal{T}_a = \{t(a) = (1-a)x + a\,y(x),\ a \in [0,1]\}$. Let 
$CV[T(a)]$ denote the coefficient of variation of the lifetime in scale $t(a)$; Kordonsky and 
Gertsbakh (1993, 1995, 1997) identify the scale having $a^*$ minimizing $CV[T(a)]$. They 
prove the (unrestricted) minimizer of $CV^2[T(a)]$ has $a^* = g^*/(1+g^*)$, where 

\[ g^* = \frac{E[Y]\,\mathrm{Var}[X] - E[X]\,\mathrm{Cov}(X,Y)}{E[X]\,\mathrm{Var}[Y] - E[Y]\,\mathrm{Cov}(X,Y)}. \tag{3.3} \]

Since the family of scales specifies $a \in [0,1]$, it is important to describe the cases 
leading to $a^* \notin [0,1]$. In fact, from (3.3) we can show that $a^* \notin [0,1]$ iff either Case A or 
Case B holds in (3.4): 

\[ \text{Case A: } CV^2(X) < \frac{\mathrm{Cov}(X,Y)}{E[X]E[Y]} < CV^2(Y); \qquad \text{Case B: } CV^2(Y) < \frac{\mathrm{Cov}(X,Y)}{E[X]E[Y]} < CV^2(X). \tag{3.4} \]


In practice, an estimate $\hat{a}^*$ of $a^*$ is obtained by replacing each of the terms in (3.3) with 
its sample estimate; Cases A and B are modified accordingly. Duchesne and Lawless 
(2000) note that when Case A holds, the minimizer of $CV^2[T(a)]$ in $\mathcal{T}_a$ has $a^* = 0$, so that 
$t = x$ is the min CV scale. When Case B holds, $a^* = 1$, so that $t = y(x)$ is the min CV scale. 
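
The sample version of (3.3) is straightforward to compute. The sketch below assumes unbiased (n - 1) sample variances and covariance, uses a small made-up data set, and does not handle the degenerate cases of a zero denominator or g* = -1; the function names are introduced here for illustration only.

```python
from statistics import mean


def sample_moments(xs, ys):
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    vx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    vy = sum((y - my) ** 2 for y in ys) / (n - 1)
    cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return mx, my, vx, vy, cxy


def min_cv_weight(xs, ys):
    """Sample version of (3.3): returns (g*, a*) with a* = g*/(1 + g*)."""
    mx, my, vx, vy, cxy = sample_moments(xs, ys)
    g = (my * vx - mx * cxy) / (mx * vy - my * cxy)
    return g, g / (1.0 + g)


def cv_squared(a, xs, ys):
    """Squared coefficient of variation of T(a) = (1 - a) X + a Y."""
    mx, my, vx, vy, cxy = sample_moments(xs, ys)
    var_t = (1 - a) ** 2 * vx + 2 * a * (1 - a) * cxy + a ** 2 * vy
    mean_t = (1 - a) * mx + a * my
    return var_t / mean_t ** 2


# Hypothetical data with negative sample covariance.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 3.0, 4.0, 1.0, 2.0]
g_star, a_star = min_cv_weight(xs, ys)
print(g_star, a_star, cv_squared(a_star, xs, ys))
```

For this symmetric toy data set the two scales receive equal weight, and nearby values of a give a larger squared coefficient of variation.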

Consider using the min CV scale to construct a multiple-scale age replacement 
policy based on a simple random sample $(X_1,Y_1), (X_2,Y_2), \ldots, (X_n,Y_n)$. If the sample 
version of Case A holds, the policy is $M_x$ (as in Case Study 1 of Chapter II). This means 
that if we use “min CV” as the criterion for time scale selection, it suffices to base the 
policy solely on the distribution of chronological time at failure. Similarly, if Case B 
holds, it suffices to base the policy solely on the distribution of cumulative usage at 
failure. Gertsbakh and Kordonsky (1997) note that if $\mathrm{Cov}(X,Y) \le 0$, then $a^* \in [0,1]$, so 
neither Case A nor Case B can occur. In this “more interesting” situation, we often find 
$0 < a^* < 1$ (we note it is possible for $a^*$ to be 0 or 1 if $\mathrm{Cov}(X,Y) \le 0$). In this case, 
policies for a decreasing sequence of ratios $r$ form a sequence of nested right triangles. 
For example, consider the metal data set. From the sample version of (3.3) we find 
$a^* = 0.871$, so the min CV scale is $t = 0.129x + 0.871y(x)$. Using (1.3) in this scale we 
find the replacement time for $0.7 < r < 1$ is 3984; for $0.594 < r < 0.7$ the replacement time 
is 3801; and for $r < 0.594$ the replacement time is 3396. These replacement times induce 
the set of nested right triangles depicted in Figure 3.4. 



Figure 3.4: Nested Policies for Metal Data. 

Dashed lines represent policy boundaries, based on the min CV scale. The 
policy for r< 0.594 is nested within the policy for 0.594 < r< 0.7, which is in 
turn nested within the policy for 0.7 <r< 1. 
D. POLICIES BASED ON IDEAL TIME SCALE 


The long-run average cost $C(\tau)$ of a single-scale age replacement policy $(0, \tau)$ is 
given in (1.1); $\tau^*$ minimizes this expression. Using the transformation $p = F(\tau)$, with 
$F^{-1}(p) = \sup\{x: F(x) \le p\}$, equation (1.1) can be rewritten as 

\[ C(p) = \frac{K + Cp}{\int_0^{F^{-1}(p)} S(u)\,du}, \quad 0 < p \le 1. \tag{3.5} \]


Solving for $p^*$ to minimize $C(p)$ in (3.5) and for $\tau^*$ in (1.1) are identical problems; the 
total time on test approach to solving the age replacement problem is based on this 
transformation. Thus, $\tau^*$ is the $p^*$-quantile of the lifetime distribution $F$. This latter 
formulation of the age replacement problem is insightful since it indicates that, under the 
policy, a device has probability $p^*$ of failure before replacement. Thus, a “natural” 
generalization of policy $(0, \tau)$ is a multiple-scale policy for which the probability of 
failure before replacement is the same (say $p$) regardless of the path. With broader 
applications in mind, Duchesne and Lawless (2000) introduce an “ideal” time scale (ITS) 
which might be used to find such a policy. 

Duchesne and Lawless (2000) motivate their definition of an ITS as follows. If a 
single scale $t_Z(x) = \Phi(x, Z(x))$ suffices for the calculation of failure probabilities, then the 
distribution of $T = \Phi(X, Z(X))$ along each $Z$ should be independent of $Z$. That is, 
$P[T > t \mid Z] = P[T > t] = G(t)$, and $G(\cdot)$ does not depend upon $Z$. In addition, $t_Z(x)$ must 
change whenever the conditional survivor functions $S_0(x, Z(x)) = P[X > x \mid Z]$ change. 
Duchesne and Lawless define $t_Z(x) = \Phi(x, Z(x))$ to be an ideal time scale if it is a 
one-to-one function of $S_0(x, Z(x))$. In this case, $P[X > x \mid Z] = G[t_Z(x)] = P[T > t_Z(x)]$. Duchesne 
(1999) explains, “an ITS is a time scale in which we can directly compare the lifetimes of 
all the devices under study, no matter what their usage patterns are ... it is ‘ideal’ in the 
sense that the age in the ITS is the only information needed to compute $P[X > x \mid Z]$, so it 
is ‘sufficient’ for computing the age of the units.” 

In fact, Duchesne (1999) mentions maintenance and inspection policies as 
potential applications of his ITS concept, and gives the following example. Suppose we 
want to inspect devices when their probability of failure is 0.25, regardless of the path. 
Suppose $t = x + 5y(x)$ is an ITS; let $T$ denote the lifetime of a device in scale $t$ and $t_{.25}$ 
denote the 25th percentile of that lifetime distribution. If $t_{.25} = 100$, devices should be 
inspected whenever $x + 5y(x) = 100$. Duchesne (1999) notes that ITSs are, by definition, 
unique up to one-to-one transformations. Hence, if $t$ defines an ITS and $\psi$ is a strictly 
increasing continuous function with $\psi(0) = 0$ and $\psi(\infty) = \infty$, then $u = \psi(t)$ is also an ITS. 
Thus, for example, $u = t^2 = (x + 5y(x))^2$ is also an ITS; let $U$ denote lifetime in this scale. 
Since $\Pr(U < 100^2) = \Pr(T < 100) = 0.25$, we have $u_{.25} = 100^2$. Thus, devices should be 
inspected whenever $(x + 5y(x))^2 = 100^2$, which is identical to the policy based upon scale $t$ 
as defined above. This is a simple consequence of the monotone transformation. 
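
The invariance of a fixed-probability policy under a monotone transformation can be checked on a sample. The sketch below uses made-up lifetimes in scale t and an order-statistic quantile without interpolation (an assumption made here so that the identity holds exactly); the function name `lower_quantile` is hypothetical.

```python
import math


def lower_quantile(values, p):
    """Smallest order statistic whose empirical CDF is at least p."""
    ordered = sorted(values)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]


# Hypothetical lifetimes in scale t = x + 5 y(x).
t_life = [40.0, 100.0, 130.0, 155.0, 180.0, 220.0, 260.0, 300.0]
u_life = [t ** 2 for t in t_life]  # the same lifetimes in scale u = t^2

t25 = lower_quantile(t_life, 0.25)
u25 = lower_quantile(u_life, 0.25)


def due_in_t(x, y):
    return x + 5 * y >= t25


def due_in_u(x, y):
    return (x + 5 * y) ** 2 >= u25


print(t25, u25)
```

Because squaring preserves the order of non-negative ages, the two rules flag exactly the same (x, y) points for inspection.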

Similarly, it seems we should be able to obtain a path-independent age 
replacement policy by finding the policy in any ITS and transforming this interval to a 
region in the positive quadrant (as described in section A above). There is a problem, 
however, stemming from the non-uniqueness of the ITS. Suppose $T$ has an exponential 
distribution. It is well known that the optimal replacement time is infinite, so the policy 
in this scale would be to replace only at failure. The $t$-scale policy $(0,\infty)$ translates to the 
entire positive quadrant. Now, consider the policy based on scale $u = t^{1/2}$: $U$ would then 
have a Weibull distribution with shape parameter 2 (and hence an increasing failure rate), 
and the policy in scale $u$ would be $(0,v)$ for some $v < \infty$. 
Translating to the plane results in the region $\{(x,y(x)): (x + 5y(x))^{1/2} < v\}$, which differs 
from the policy based on scale $t$. 
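
A numerical check makes the contrast concrete. This is a minimal sketch that assumes the cost function has the form (K + C F(tau)) divided by the integral of the survivor function from 0 to tau, a unit-rate exponential lifetime in scale t, cost ratio r = 0.5, and a finite search grid; none of these specific values come from the text.

```python
import math


def cost_exponential(tau, K, C, rate=1.0):
    """Cost in scale t when T is exponential with the given rate."""
    F = 1.0 - math.exp(-rate * tau)
    integral_S = F / rate  # int_0^tau exp(-rate*u) du
    return (K + C * F) / integral_S


def cost_sqrt_scale(v, K, C, rate=1.0):
    """Cost in scale u = t**0.5, where U has survivor function exp(-rate*u**2)."""
    F = 1.0 - math.exp(-rate * v * v)
    integral_S = 0.5 * math.sqrt(math.pi / rate) * math.erf(math.sqrt(rate) * v)
    return (K + C * F) / integral_S


K, C = 0.5, 1.0                              # assumed costs, r = K/C = 0.5
grid = [0.01 * k for k in range(1, 1001)]    # 0.01, 0.02, ..., 10.0

tau_best = min(grid, key=lambda s: cost_exponential(s, K, C))
v_best = min(grid, key=lambda s: cost_sqrt_scale(s, K, C))
print(tau_best, v_best)
```

In scale t the cost keeps decreasing, so the grid minimizer sits at the largest grid value (replace only at failure), whereas in scale u the minimizer is an interior point.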

To illustrate this, consider the metal fatigue data discussed in Case Study 3 of 
Chapter II. Duchesne and Lawless (2000) show that scale $t = x + 6.7y(x)$ is a reasonable 
approximation to the true, unknown ITS. Let $T$ denote the lifetime in this scale; we first 
“reduce” each pair $(x, y(x))$ to scale $t$. Then, upon estimating $F_T(t) = P(T \le t)$ with the 
empirical distribution, we find that for $r = 0.5$, the minimizer of (1.3) is $\tau = 26125$. The 
ITS interval $(0, 26125)$ corresponds to the region $M_T = \{(x,y(x)): x + 6.7y(x) < 26125\}$. 
The boundary of this policy is the solid line in Figure 3.5. Under this policy, we replace 
the device upon failure or when the sum of its accumulated low cycles and 6.7 times its 
accumulated high cycles reaches 26125. 
Figure 3.5: Policies Based on Ideal Scales t and u. 

The solid line represents the policy boundary for r = 0.5 based on scale t 
and the dashed line represents the policy boundary for r = 0.5 based on 
scale u. 


We now construct the age replacement policy for these data using another ITS. If 
$t = x + 6.7y(x)$ is ideal for the metal data, then the monotone transformation $u = t^2$ is also 
ideal. Proceeding as above, upon calculating the failure times $U$ we find the minimizer of 
equation (1.3) is $v = 40760^2$. In the plane, the ITS interval $(0, 40760^2)$ corresponds to the 
region $M_U = \{(x, y(x)): x + 6.7y(x) < 40760\}$. The boundary of this region is the dashed 
line in Figure 3.5. Observe $M_U$ is not the same as $M_T$, the region derived from the first 
ideal scale. 

In summary, path-independent, fixed-probability-of-failure inspection policies can 
be based on an ITS, but basing an age replacement policy on an ITS can pose significant 
problems. The reason ideal scales pose problems for age replacement but not 
fixed-probability inspection policies relates to our discussion of the ordering and “spacing” 
action of combined scales. An ITS $\Phi$, like other combined scales, orders and induces 
spacings between the failure times. A monotone function of $\Phi$ maintains the ordering 
of the times given by ITS $\Phi$, but the spacings change. This fundamentally changes the 
nature of the failure distribution on which the optimal age replacement policy depends. 
(An obvious exception is when $\psi$ is linear; see Lemma A.1 in Appendix A.) More 
specifically, let $T$ and $U$ denote the lifetimes in scales $\Phi$ and $\psi(\Phi)$, respectively; let $\tau^*$ 
and $v^*$ denote optimal replacement times in these scales. The observation above is that 
although $U = \psi(T)$, it is not necessarily true that $v^* = \psi(\tau^*)$. This is due to the fact that in 
transforming the cost function (1.1) from scale $t$ to scale $u$, the numerator remains 
constant but the denominator changes. 
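
The last remark can be made explicit with a short calculation. The sketch below adds one assumption not made in the text, namely that $\psi$ is differentiable, and writes $C_T$, $C_U$ for the cost function (1.1) in scales $t$ and $u$ and $F_T$, $S_T$ for the distribution and survivor functions of $T$ (notation introduced here). Since $F_U(\psi(\tau)) = F_T(\tau)$ and $S_U(w) = S_T(\psi^{-1}(w))$, the substitution $w = \psi(u)$ gives

```latex
\[
C_U(\psi(\tau))
  = \frac{K + C\,F_T(\tau)}{\int_0^{\psi(\tau)} S_T\!\left(\psi^{-1}(w)\right) dw}
  = \frac{K + C\,F_T(\tau)}{\int_0^{\tau} S_T(u)\,\psi'(u)\,du},
\qquad
C_T(\tau) = \frac{K + C\,F_T(\tau)}{\int_0^{\tau} S_T(u)\,du}.
\]
```

The numerators agree, but the denominator of $C_U$ weights $S_T$ by $\psi'$. When $\psi(t) = ct$ for a constant $c > 0$, the weight is constant, $C_U(\psi(\tau)) = C_T(\tau)/c$, and the minimizers correspond; for other $\psi$ they need not.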

E. DISCUSSION AND SUMMARY 

In this chapter we have discussed how a multiple-scale age replacement policy 
might be obtained if scales age and usage are combined in various ways. One method of 
Kordonsky and Gertsbakh (1994) is motivated from the standpoint of cost. For a fixed 
$r > 0$, this method finds the “best” vector $(1-a, a)$ on which to project the data based on a 
“converted” cost function; the resulting policy $M_r$ is triangular (or possibly of form $M_x$ or 
$M_y$). However, we note for $s < r$ the method is not guaranteed to have $M_s \subseteq M_r$; this is 
because the “best” scale depends on the cost ratio. For a fixed $r > 0$, policies based on 
the min CV scale are triangular (or possibly $M_x$ or $M_y$) and since minimizing CV results 
in a vector $(1-a, a)$ independent of $r$, the policies for a decreasing sequence of cost ratios 
are nested. Finally, we note that if, based on the failure data, a reasonable estimate of the 
ITS can be found, a policy in this scale has the property of fixed probability of failure 
before replacement, regardless of the path. While this property is attractive, we note 
monotone transformations of the ITS are also ideal, but do not necessarily result in the 
same policy as in the original ITS. 

Combining scales is convenient in that it allows analysis to proceed along one 
scale. There is a drawback to the combining of scales, however. Kordonsky and 
Gertsbakh (1995) explain how damages in the different time scales can interact: in 
aviation, corrosion (as reflected by the time scale “calendar time”) affects both fatigue 
damages due to the amount of time in level flight (as reflected by the time scale “flight 
hours”) and the high-amplitude stresses incurred during the takeoff and landing cycle (as 
reflected by the time scale “number of landings”). As such, they observe “No single time 
scale is sufficient for a complete description of all wear and damage accumulation 
leading to failure in one of the aircraft parts.” Thus, useful information may be lost even 
if the “best” single time scale is used (i.e., the one which best accounts for the damage 
accumulation processes and their interaction); for this reason, we proceed to the 
introduction of new methods which do not combine the scales. 
IV. POLICIES GIVEN DATA ALONG SEVERAL LINEAR PATHS 


In this chapter we generalize the single-scale failure replacement interval $(0, \tau)$ to 
the multiple-scale setting in which failure data fall along several linear paths. Such 
situations often arise in modeling real-world observational lifetime data in multiple scales 
(e.g., Gertsbakh and Kordonsky, 1998 and Lawless et al., 1995). In many cases X and Y 
are known but the usage curve Z is unknown and is approximated by a straight line. 

Linear usage paths may also arise by cyclic usage in fatigue life experiments (as 
exemplified in the metal data). The development is as follows. First, we establish 
notation to be used throughout the chapter. In so doing, we describe the cost function 
used to define an “optimal” policy in this setting. Next, we explain how to estimate the 
optimal policy for given failure data, and present an example. We then compare our 
approach to the methods found in the literature, and summarize. 

A. “COMPOSITE” POLICIES 

Consider a population of devices differing only in their rates of use, which 
remain constant throughout their lifetimes. Thus, suppose that upon entering service, a 
device is assigned a linear path $Z_i$ (characterized by its slope $\theta_i$) with probability $p_i$, 
$i = 1, \ldots, m$. Suppose also that $0 < \theta_1 < \theta_2 < \cdots < \theta_m < \infty$. Let $F_i$ be the distribution of 
lifetime $X$ (in chronological time) given $\theta = \theta_i$, $i = 1, \ldots, m$; as in Chapter I, 
$F_i(x) = P(X \le x \mid \theta = \theta_i)$. From (1.1) the long-run average cost per unit time for a device operating with 
$\theta = \theta_i$ under policy $(0, \tau_i)$ is 

\[ C_i(\tau_i) = \frac{K + C\,F_i(\tau_i)}{\int_0^{\tau_i} S_i(u)\,du}, \quad \tau_i > 0,\ i = 1, \ldots, m. \tag{4.1} \]

Let $\tau_i^*$ be an optimal age replacement time for devices on path $i$, $i = 1, \ldots, m$; that is, 
$\tau_i^* = \arg\min_{\tau_i} C_i(\tau_i)$. To form a composite policy from the path-specific policies $(0, \tau_i^*)$ for 
$i = 1, \ldots, m$, let $M_{\boldsymbol{\tau}^*} = \{(x, \theta_i x): 0 < x < \tau_i^*,\ i = 1, \ldots, m\}$. This composite policy has 
replacement times summarized by the vector $(\tau_1^*, \tau_2^*, \ldots, \tau_m^*)$, meaning devices on 
path $Z_i$ are replaced upon failure or when their age reaches $\tau_i^*$ (whichever occurs first), 
$i = 1, \ldots, m$. As in Case Study 1 of Chapter II, since at any given chronological time 
$x > 0$, the position of a device along its usage path is known, we can specify the 
replacement times solely in terms of chronological time. 

In Case Study 3 of Chapter II, for the metal data, estimation of replacement times 
for such a composite policy did not result in a sensible policy. More specifically, with 
$\Theta = \{0.053, 0.250, 0.667, 1.5, 4, 19\}$ and $\mathcal{X} = \{(x, \theta_i x): 0 < x,\ \theta_i \in \Theta,\ i = 1, \ldots, 6\}$, the 
composite policy with replacement times (23580, 10300, 5700, 3200, 1000, 275) does not 
correspond to a region which is a lower set in $\mathcal{M}_{\mathcal{X}}$. We now give conditions on a 
replacement time vector $\boldsymbol{\tau} = (\tau_1, \tau_2, \ldots, \tau_m)$ that ensure $M_{\boldsymbol{\tau}}$ is a lower set. 

Proposition 4.1. A composite policy $M_{\boldsymbol{\tau}} = \{(x, \theta_i x): 0 < x < \tau_i,\ i = 1, \ldots, m\}$ for 
devices on linear usage paths where $0 < \theta_1 < \theta_2 < \cdots < \theta_m$ is a lower set with respect to 
the matrix partial order on $\mathcal{X} = \{(x, \theta_i x): 0 < x,\ i = 1, \ldots, m\}$ if and only if both $\tau_{i+1} \le \tau_i$ 
and $\theta_{i+1}\tau_{i+1} \ge \theta_i\tau_i$, $i = 1, \ldots, m-1$. 
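Proof: Starting with the reverse statement, let $\mathbf{x} \in M_{\boldsymbol{\tau}}$ and let $\mathbf{y} \in \mathcal{X}$ such that 
$\mathbf{y} \preceq \mathbf{x}$. To show $M_{\boldsymbol{\tau}}$ is a lower set with respect to the matrix partial order on $\mathcal{X}$, it suffices 
to show $\mathbf{y} \in M_{\boldsymbol{\tau}}$. Because $\mathbf{x} \in M_{\boldsymbol{\tau}}$, $\mathbf{x} = (t, \theta_j t)$ for some $0 < t < \tau_j$ and some $\theta_j$ in $\Theta$. 
Similarly, because $\mathbf{y} \in \mathcal{X}$, $\mathbf{y} = (s, \theta_k s)$ for some $s > 0$ and some $\theta_k$ in $\Theta$. Because $\mathbf{y} \preceq \mathbf{x}$, it 
follows that $s \le t$ and $\theta_k s \le \theta_j t$. It suffices to show $0 < s < \tau_k$. First, treat the case $k \le j$. 
Because $s \le t$ and $\tau_j \le \tau_k$, we have $0 < s \le t < \tau_j \le \tau_k$. On the other hand, if $k > j$, then 
because $\theta_k s \le \theta_j t$ and $\theta_j\tau_j \le \theta_k\tau_k$, we have $0 < s \le (\theta_j/\theta_k)t < (\theta_j/\theta_k)\tau_j \le \tau_k$. Thus, the policy 
is a lower set. 

Turning to the direct statement, suppose $M_{\boldsymbol{\tau}}$ is a lower set; let $i \in \{1, \ldots, m-1\}$. 
Suppose further that $\tau_{i+1} > \tau_i$. Let $x = (\tau_{i+1} + \tau_i)/2$; consider $\mathbf{u} = (x, \theta_{i+1}x) \in M_{\boldsymbol{\tau}}$ and 
$\mathbf{v} = (x, \theta_i x) \in \mathcal{X}$. Note that $\mathbf{v} \preceq \mathbf{u}$, but because $x > \tau_i$, $\mathbf{v} \notin M_{\boldsymbol{\tau}}$. This contradicts the fact that 
$M_{\boldsymbol{\tau}}$ is a lower set. Thus, $\tau_{i+1} \le \tau_i$. Similarly, suppose $\theta_{i+1}\tau_{i+1} < \theta_i\tau_i$. Let 
$y = (\theta_{i+1}\tau_{i+1} + \theta_i\tau_i)/2$, $x = y/\theta_i$, and $z = y/\theta_{i+1}$. Consider $\mathbf{u} = (x, y) \in M_{\boldsymbol{\tau}}$ and $\mathbf{v} = (z, y) \in \mathcal{X}$. Note 
that $\mathbf{v} \preceq \mathbf{u}$, but because $z > \tau_{i+1}$, $\mathbf{v} \notin M_{\boldsymbol{\tau}}$, contradicting the fact that $M_{\boldsymbol{\tau}}$ is a lower set. Thus, 
$\theta_{i+1}\tau_{i+1} \ge \theta_i\tau_i$. 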


This proposition reveals the problems encountered in Case Study 3 of Chapter II. 
The policy with $\boldsymbol{\tau}$ = (23580, 10300, 5700, 3200, 1000, 275) has $\theta_5\tau_5 < \theta_4\tau_4$ 
(4 × 1000 = 4000 versus 1.5 × 3200 = 4800); in order for 
$M_{\boldsymbol{\tau}}$ to be in $\mathcal{M}_{\mathcal{X}}$ we need $\theta_5\tau_5 \ge \theta_4\tau_4$ (all other requirements of the proposition are 
satisfied). Similarly, the policy with $\boldsymbol{\tau}$ = (10000, 10300, 5700, 3200, 1200, 275) has 
$\tau_2 > \tau_1$; for $M_{\boldsymbol{\tau}}$ to be in $\mathcal{M}_{\mathcal{X}}$ we need $\tau_2 \le \tau_1$. Thus, for the metal data, the hypothetical 
policies we considered in Case Study 3 are not lower sets. Several ad hoc methods can 
be used to transform these policies into members of $\mathcal{M}_{\mathcal{X}}$. For one, a linear interpolation 
can be used to “smooth” sequential members of $\boldsymbol{\tau}$ which violate either of the conditions 
$\tau_{i+1} \le \tau_i$ or $\theta_{i+1}\tau_{i+1} \ge \theta_i\tau_i$. Another alternative is to use a pooling scheme (as is done in 
isotonic regression, ref. Robertson, Wright, and Dykstra 1988) to transform the policy. 
However, neither of these schemes takes into account the cost of implementing the 
resulting policy. Since it is desirable to obtain a sensible policy which is optimal with 
respect to some cost function, we now introduce such a cost function. 
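
The two conditions of Proposition 4.1 can be checked mechanically. The sketch below applies them to the slopes and the two replacement-time vectors quoted above for the metal data; the function names `violations` and `is_lower_set` and the labels "time" and "usage" are introduced here for illustration.

```python
def violations(thetas, taus):
    """List (i, kind) for each adjacent pair (i, i+1), 1-indexed, that breaks
    a condition of Proposition 4.1:
      'time'  if tau[i+1] > tau[i],
      'usage' if theta[i+1]*tau[i+1] < theta[i]*tau[i]."""
    found = []
    for i in range(len(taus) - 1):
        if taus[i + 1] > taus[i]:
            found.append((i + 1, "time"))
        if thetas[i + 1] * taus[i + 1] < thetas[i] * taus[i]:
            found.append((i + 1, "usage"))
    return found


def is_lower_set(thetas, taus):
    return not violations(thetas, taus)


thetas = [0.053, 0.250, 0.667, 1.5, 4, 19]          # path slopes, metal data
policy_1 = [23580, 10300, 5700, 3200, 1000, 275]    # first policy in the text
policy_2 = [10000, 10300, 5700, 3200, 1200, 275]    # second policy in the text

print(violations(thetas, policy_1))
print(violations(thetas, policy_2))
```

The first policy fails only the usage condition between paths 4 and 5, and the second fails only the time condition between paths 1 and 2, in agreement with the discussion above.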

B. THE COST OF A COMPOSITE POLICY 

The first policy of Case Study 3 of Chapter II is “optimal” in the sense that it 
minimizes the (estimated) long-run average cost per unit of time in use for devices on 
each path $i = 1, \ldots, 6$. Unfortunately, the policy is not sensible from the standpoint of 
implementation. We need a means of obtaining a policy that is “optimal” in a sense 
which accounts for costs along each path, but is simultaneously “sensible.” An equitable 
method of calculating the cost of policy $M_{\boldsymbol{\tau}}$ with corresponding replacement time vector 
$\boldsymbol{\tau} = (\tau_1, \tau_2, \ldots, \tau_m)$ is to form the average, weighted by the assigned probabilities, of the 
costs of the path-specific policies: let 

\[ C(\boldsymbol{\tau}) = \sum_{i=1}^{m} p_i\,C_i(\tau_i), \quad \tau_i > 0,\ i = 1, \ldots, m. \tag{4.2} \]

A cost function of this form is studied by Gertsbakh and Kordonsky (1997) as they 
address the “optimal” time scale for maintenance in heterogeneous environments. Here 
$C(\boldsymbol{\tau})$ represents the expected long-run average cost per unit of time in use of maintaining 
a device under a policy corresponding to its operating conditions. The dimension of $C(\boldsymbol{\tau})$ 
is in units of cost per unit of (chronological) time in use. If it is more meaningful to the 
decision maker, equation (4.2) can be easily transformed so it has dimension units of cost 
per unit of time in use in the second scale. 

From Proposition 4.1, we note that in order for a policy $M_\tau$ with replacement time
vector $\tau = (\tau_1, \tau_2, \ldots, \tau_m)$ to be sensible, $\tau$ must lie in the set A, defined by

$$A = \{\tau \in (0,\infty)^m : \infty > \tau_1 \ge \tau_2 \ge \cdots \ge \tau_m > 0,\ \ \theta_1\tau_1 \le \theta_2\tau_2 \le \cdots \le \theta_m\tau_m\}. \tag{4.3}$$

Thus, to find the optimal “sensible” policy for a given r > 0, one must minimize (4.2)
subject to the restriction that $\tau$ is in A. Let $\tau^*$ denote this minimizer.
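The membership test for the set A in (4.3) is simple to mechanize. The following is a minimal sketch, not part of the original development: the function name `in_A` and the absolute tolerance `tol` are illustrative assumptions, and the slopes are assumed to be supplied in increasing order.

```python
from typing import Sequence


def in_A(tau: Sequence[float], theta: Sequence[float], tol: float = 1e-9) -> bool:
    """Return True if tau satisfies the constraints (4.3) for slopes theta.

    theta is assumed sorted in increasing order; tol (an assumption of this
    sketch) absorbs rounding error in the supplied values.
    """
    if len(tau) != len(theta):
        return False
    if any(t <= 0 or t == float("inf") for t in tau):
        return False
    pairs = list(zip(tau, theta))
    for (t1, th1), (t2, th2) in zip(pairs, pairs[1:]):
        if t1 < t2 - tol:               # chronological times must not increase
            return False
        if th1 * t1 > th2 * t2 + tol:   # ages in the second scale must not decrease
            return False
    return True
```

A vector passes only if its chronological replacement times are non-increasing while the corresponding ages in the second scale are non-decreasing. When the inputs are rounded, as in a printed table, a looser tolerance such as `tol=0.01` may be needed.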

For a given r > 0, let $\tau_i^*$ denote the minimizer of $C_i(\tau_i)$. If a collection of conditional distributions $\{F_i\}$ has
$(\tau_1^*, \tau_2^*, \ldots, \tau_m^*) \in A$, then by the optimality of each $\tau_i^*$ it follows that
$\tau^* = (\tau_1^*, \tau_2^*, \ldots, \tau_m^*)$, regardless of the mixing probabilities. Collections of distributions
with this property often arise from models common in the literature. Lawless et al.
(1995) study failure data from automobile brake pads using a form of accelerated failure
time model in which they form a time scale $u = x^{1-\eta}\, y(x)^{\eta}$, $\eta \in [0,1]$. They assume linear
usage paths $y(x) = \theta x$, so that $u = x\,\theta^{\eta}$, and they fit a two-parameter Weibull distribution
to failure times in scale u. Although their work does not pertain directly to age
replacement theory, the resulting collection of distributions of $X \mid \theta$ has this property.
Duchesne and Lawless (2000), Gertsbakh and Kordonsky (1998), and Oakes (1995)
study linear time scales $t = x + g\,y(x)$, $g > 0$. Under a linear path assumption, time scale t
takes the form $x(1 + g\theta)$. When a parametric distribution including a scale parameter is fit to
failure times in scale t, the resulting collection of distributions of $X \mid \theta$ has this property.
In certain cases, proportional hazard models can also produce collections of conditional
distributions with this property.


C. ESTIMATING THE OPTIMAL COMPOSITE POLICY 

We now turn to estimation under constraints (4.3). Assume $\{F_i\}$ is a collection of
distributions with $(\tau_1^*, \tau_2^*, \ldots, \tau_m^*) \in A$. Following (1.3), let $\hat{S}_i$ denote the empirical
survivor function based on the ordered sample of chronological lifetimes
$x_{i,(1)} \le x_{i,(2)} \le \cdots \le x_{i,(n_i)}$ from path i, where $n_i$ is the number of observations on path i, and
let

$$\hat{C}_i(\tau_i) = \frac{K + C - C\,\hat{S}_i(\tau_i)}{\int_0^{\tau_i} \hat{S}_i(u)\,du}, \qquad \tau_i > 0,\ i = 1, \ldots, m. \tag{4.4}$$

Thus, $\hat{C}_i(\tau_i)$ estimates $C_i(\tau_i)$. The following is the analog of (1.3) for the multiple-path
scenario:

$$\hat{C}(\tau) = \sum_{i=1}^{m} p_i\, \hat{C}_i(\tau_i), \qquad \tau_i > 0,\ i = 1, \ldots, m. \tag{4.5}$$
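The estimators (4.4) and (4.5) can be sketched in a few lines. This is an illustration under stated assumptions rather than the original implementation: the function names are hypothetical, and the empirical survivor function is taken to be the fraction of lifetimes greater than or equal to the argument, a convention chosen so that the cost on each interval between consecutive failure times is smallest at the right endpoint.

```python
import numpy as np


def empirical_cost(lifetimes, tau, K, C):
    """Path-specific estimate (4.4) at replacement age tau."""
    x = np.asarray(lifetimes, dtype=float)
    # Assumed convention: S_hat(tau) is the fraction of lifetimes >= tau.
    surv = np.mean(x >= tau)
    # The integral of S_hat over (0, tau) equals the mean of min(x, tau).
    exposure = np.mean(np.minimum(x, tau))
    return (K + C - C * surv) / exposure


def composite_cost(samples, tau, p, K, C):
    """Composite estimate (4.5): probability-weighted sum of the path costs."""
    return sum(p_i * empirical_cost(x_i, t_i, K, C)
               for x_i, t_i, p_i in zip(samples, tau, p))
```

Passing `tau = float("inf")` gives the cost of replacing only at failure, since the survivor term vanishes and the integral becomes the sample mean.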


In the univariate problem, the fact that the empirical cost function (1.3) is a
piecewise decreasing function reduces the search for the minimum to a finite number of
“strategic” points. Similar principles apply to searching for a minimizer of $\hat{C}(\tau)$; let $\hat{\tau}$
be such a minimizer, i.e., $\hat{C}(\hat{\tau}) \le \hat{C}(\tau)$ for all $\tau$ in A. We now describe $\hat{\tau}$ and prove that
it is globally optimal.

For convenience, suppose that along each path no two failure times are equal, so
that $x_{i,(1)} < x_{i,(2)} < \cdots < x_{i,(n_i)}$; also let $x_{i,(0)} = 0$ and $x_{i,(n_i+1)} = \infty$, i = 1, ..., m. Form an
m-dimensional grid

$$\Gamma = \prod_{i=1}^{m} \{x_{i,(1)}, x_{i,(2)}, \ldots, x_{i,(n_i)}\} \tag{4.6}$$

based on the observations along each path. In each m-dimensional hypercube of the form

$$H = \prod_{i=1}^{m} \left(x_{i,(j_i)},\, x_{i,(j_i+1)}\right], \quad \text{where } j_i \in \{0, \ldots, n_i\},\ i = 1, \ldots, m, \tag{4.7}$$

$\hat{C}(\tau)$ is decreasing in each argument; it follows that the minimum of $\hat{C}(\tau)$ in H occurs
at the vertex $(x_{1,(j_1+1)}, x_{2,(j_2+1)}, \ldots, x_{m,(j_m+1)})$. Note this vertex dominates all other points in
H with respect to the matrix partial order on $(0,\infty)^m$; that is,
$\tau \preceq (x_{1,(j_1+1)}, x_{2,(j_2+1)}, \ldots, x_{m,(j_m+1)})$ for all $\tau \in H$. Thus, to find the global minimum of $\hat{C}(\tau)$ in
the absence of constraints, we evaluate $\hat{C}(\tau)$ at all such non-dominated vertices and
select the one yielding the smallest cost. In the presence of constraints (4.3) defining set
A, it seems reasonable to limit our search for $\hat{\tau}$ to the set of these vertices which lie in A,
but it can be shown that checking only such vertices will not necessarily produce the
global minimum. Such a procedure, though, will yield a point corresponding to an upper
bound for the optimal cost.

Let $\mathcal{H}$ denote the set of all hypercubes H as in (4.7) for which $H \cap A \neq \emptyset$. For
some H in this set, the non-dominated vertex lies in A; for
others, this vertex lies outside of A. In the latter case, the non-dominated point in $H \cap A$
(i.e., the point that simultaneously maximizes the value of each coordinate) yields the
smallest value of $\hat{C}(\tau)$. To find $\hat{\tau}$, an enumeration procedure is utilized to find the
non-dominated point, say u(H), in $H \cap A$ for all H in $\mathcal{H}$. Then $\hat{\tau} = \arg\min \hat{C}(u(H))$ among
all $H \in \mathcal{H}$. For each $H \in \mathcal{H}$, the non-dominated point u(H) is constructed explicitly based
on the following results.


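Proposition 4.2. For any $x = (x_1, x_2, \ldots, x_m)$ in $(0,\infty)^m$, let
$B_x = \{\tau \in (0,\infty)^m : \tau \preceq x\}$. Define u(x) as follows: $u(x) = (u_1(x), u_2(x), \ldots, u_m(x))$, where

$$u_1(x) = \min\left\{x_1,\ \tfrac{\theta_2}{\theta_1}x_2,\ \tfrac{\theta_3}{\theta_1}x_3,\ \ldots,\ \tfrac{\theta_m}{\theta_1}x_m\right\},$$
$$u_2(x) = \min\left\{x_1,\ x_2,\ \tfrac{\theta_3}{\theta_2}x_3,\ \ldots,\ \tfrac{\theta_m}{\theta_2}x_m\right\},$$
$$\vdots$$
$$u_i(x) = \min\left\{x_1,\ x_2,\ \ldots,\ x_i,\ \tfrac{\theta_{i+1}}{\theta_i}x_{i+1},\ \ldots,\ \tfrac{\theta_m}{\theta_i}x_m\right\},$$
$$\vdots$$
$$u_m(x) = \min\{x_1,\ x_2,\ \ldots,\ x_m\}.$$

Then (1) $u(x) \in A \cap B_x$ and (2) $y \preceq u(x)$ for all $y \in A \cap B_x$.

Proof: First, $u(x) \in B_x$ since $u_i(x) \le x_i$, i = 1, ..., m; that is, $u(x) \preceq x$. To show
$u(x) \in A$ it suffices to show $\theta_i u_i(x) \le \theta_{i+1} u_{i+1}(x)$ and $u_i(x) \ge u_{i+1}(x)$ for i = 1, ..., m − 1.
Let $i \in \{1, \ldots, m-1\}$. Since $\theta_{i+1} > \theta_i$,

$$\begin{aligned}
\theta_i u_i(x) &= \theta_i \min\left\{x_1, x_2, \ldots, x_i,\ \tfrac{\theta_{i+1}}{\theta_i}x_{i+1},\ \ldots,\ \tfrac{\theta_m}{\theta_i}x_m\right\} \\
&= \min\left(\theta_i x_1, \theta_i x_2, \ldots, \theta_i x_i,\ \theta_{i+1}x_{i+1},\ \ldots,\ \theta_m x_m\right) \\
&\le \min\left(\theta_{i+1}x_1, \theta_{i+1}x_2, \ldots, \theta_{i+1}x_i,\ \theta_{i+1}x_{i+1},\ \theta_{i+2}x_{i+2},\ \ldots,\ \theta_m x_m\right) \\
&= \theta_{i+1} \min\left\{x_1, x_2, \ldots, x_i, x_{i+1},\ \tfrac{\theta_{i+2}}{\theta_{i+1}}x_{i+2},\ \ldots,\ \tfrac{\theta_m}{\theta_{i+1}}x_m\right\} \\
&= \theta_{i+1} u_{i+1}(x);
\end{aligned}$$

also

$$\begin{aligned}
u_i(x) &= \min\left\{x_1, x_2, \ldots, x_i,\ \tfrac{\theta_{i+1}}{\theta_i}x_{i+1},\ \tfrac{\theta_{i+2}}{\theta_i}x_{i+2},\ \ldots,\ \tfrac{\theta_m}{\theta_i}x_m\right\} \\
&\ge \min\left\{x_1, x_2, \ldots, x_i, x_{i+1},\ \tfrac{\theta_{i+2}}{\theta_{i+1}}x_{i+2},\ \ldots,\ \tfrac{\theta_m}{\theta_{i+1}}x_m\right\} \\
&= u_{i+1}(x).
\end{aligned}$$

Thus, $u(x) \in A \cap B_x$, proving (1). To show (2), let $y \in A \cap B_x$, and let $i \in \{1, \ldots, m\}$.
Since $y \in A$, $y_1 \ge y_2 \ge \cdots \ge y_i$ and $\theta_m y_m \ge \cdots \ge \theta_{i+2}y_{i+2} \ge \theta_{i+1}y_{i+1} \ge \theta_i y_i$; then
$(\theta_m/\theta_i)y_m \ge \cdots \ge (\theta_{i+2}/\theta_i)y_{i+2} \ge (\theta_{i+1}/\theta_i)y_{i+1} \ge y_i$, so by definition of u(y), $u_i(y) = y_i$. It
follows that u(y) = y. Since $y \in B_x$, it follows that $y_i \le x_i$, i = 1, ..., m. Since each $u_i(z)$ is
non-decreasing in each argument of $z \in (0,\infty)^m$, we have $y = u(y) \preceq u(x)$, as required.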
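The construction in Proposition 4.2 translates directly into code. The sketch below is illustrative only; the function name `dominating_point` is an assumption, and the slopes are assumed to be positive and listed in increasing order.

```python
from typing import List, Sequence


def dominating_point(x: Sequence[float], theta: Sequence[float]) -> List[float]:
    """Return u(x) as defined in Proposition 4.2 for increasing slopes theta."""
    m = len(x)
    u = []
    for i in range(m):
        # Coordinates up to and including i enter unscaled; later coordinates
        # are scaled by theta_k / theta_i.
        candidates = list(x[: i + 1])
        candidates += [theta[k] / theta[i] * x[k] for k in range(i + 1, m)]
        u.append(min(candidates))
    return u
```

For instance, with slopes (1, 2, 5) and x = (100, 10, 1), the first coordinate is min{100, 20, 5} = 5, the second is min{100, 10, 2.5} = 2.5, and the third is min{100, 10, 1} = 1, so every coordinate is pulled down until both orderings in (4.3) hold. A point already in A is returned unchanged.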


We now use this result to find u(H) for $H \in \mathcal{H}$.

Proposition 4.3. Let H as in (4.7) be a member of $\mathcal{H}$, let $x = (x_1, x_2, \ldots, x_m)$ denote
the vertex $(x_{1,(j_1+1)}, x_{2,(j_2+1)}, \ldots, x_{m,(j_m+1)})$, and let $z = (z_1, z_2, \ldots, z_m)$ denote the vertex
$(x_{1,(j_1)}, x_{2,(j_2)}, \ldots, x_{m,(j_m)})$. Let u(H) = u(x) as in Proposition 4.2. Then (1) $y \preceq u(H)$
for all $y \in A \cap H$, and (2) $u(H) \in A \cap H$.

Proof: Let $u = (u_1, u_2, \ldots, u_m) = u(H)$. Let $y \in A \cap H$ (such a y exists, since
$A \cap H \neq \emptyset$). Since $H \subset B_x$ it follows that $y \in A \cap B_x$; from Proposition 4.2 we know that
$y \preceq u$, thus proving (1). By (1), we have $y_i \le u_i$, i = 1, ..., m. Since $y \in H$, we know that
$z_i < y_i \le x_i$, i = 1, ..., m. Because $u \preceq x$, we know $u_i \le x_i$, i = 1, ..., m. From these
inequalities it follows that $z_i < u_i \le x_i$, i = 1, ..., m, so that $u(H) \in H$. By Proposition 4.2
we know $u(H) \in A$; thus, we have shown (2).
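The enumeration over hypercubes can be sketched as follows. This is a brute-force illustration under assumptions, not the original implementation: the function names are hypothetical, the empirical survivor function is taken as the fraction of lifetimes greater than or equal to its argument, and a hypercube is retained when u(H) falls inside it, which by Propositions 4.2 and 4.3 is equivalent to H meeting A. An all-infinite result is read here as "replace only at failure."

```python
import itertools

import numpy as np


def path_cost(x, tau, K, C):
    """Empirical cost (4.4), with S_hat(tau) = fraction of lifetimes >= tau."""
    return (K + C - C * np.mean(x >= tau)) / np.mean(np.minimum(x, tau))


def u_of(x, theta):
    """Non-dominated point u(x) of Proposition 4.2."""
    m = len(x)
    return [min(list(x[: i + 1])
                + [theta[k] / theta[i] * x[k] for k in range(i + 1, m)])
            for i in range(m)]


def constrained_minimizer(samples, theta, p, K, C):
    """Enumerate the hypercubes (4.7); return (tau_hat, cost) over the set A."""
    data = [np.sort(np.asarray(s, dtype=float)) for s in samples]
    # Grid edges per path: 0, the ordered lifetimes, and infinity.
    edges = [np.concatenate(([0.0], d, [np.inf])) for d in data]
    best_tau, best_cost = None, np.inf
    for j in itertools.product(*(range(len(d) + 1) for d in data)):
        lower = [e[k] for e, k in zip(edges, j)]       # vertex z
        upper = [e[k + 1] for e, k in zip(edges, j)]   # vertex x
        u = u_of(upper, theta)
        # Skip hypercubes that do not contain their own u(H).
        if not all(lo < ui <= hi for lo, ui, hi in zip(lower, u, upper)):
            continue
        cost = sum(pi * path_cost(d, ui, K, C) for pi, d, ui in zip(p, data, u))
        if cost < best_cost:
            best_tau, best_cost = u, cost
    return best_tau, best_cost
```

The number of hypercubes grows as the product of the $(n_i + 1)$, so this direct enumeration is practical only for the small samples considered here.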

We now show that our procedure returns the global minimum of $\hat{C}(\tau)$.

Theorem 4.1: $\hat{C}(\hat{\tau}) \le \hat{C}(\tau)$ for all $\tau \in A$.

Proof: Let $\tau \in A$. Because the grid $\Gamma$ defines a partition of the positive orthant,
$\tau \in H$ for some $H \in \mathcal{H}$. Form u(H) as described above. By definition, $\hat{C}(\hat{\tau}) \le \hat{C}(u(H))$,
so it remains to show $\hat{C}(u(H)) \le \hat{C}(\tau)$. By construction of u(H) we have $\tau \preceq u(H)$; in H,
$\hat{C}(\tau)$ is decreasing in each argument, so it follows that $\hat{C}(u(H)) \le \hat{C}(\tau)$, as required.

D. EXAMPLE

Returning to the metal fatigue data from Case Study 3 of Chapter II, Table 4.1
contains the policy vector $\hat{\tau} = (\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_6)$ for r = 0.5 along with the values $\theta_i\hat{\tau}_i$, to
emphasize that $\hat{\tau}$ is a member of A. In the policy for r = 0.75, $\hat{\tau}_2 = 15200$ so
that $\theta_2\hat{\tau}_2 = 3800$. All other components are identical to the policy for r = 0.5. The policy
for r = 1 is identical to the policy for r = 0.75. Thus, for these data the procedure produces
nested policies for these values of the cost ratio. Figure 4.1 contains a scatterplot of the
data overlaid with line segments representing paths curtailed by their corresponding
replacement times for r = 0.5.


  i    Slope $\theta_i$    $\hat{\tau}_i$      $\theta_i\hat{\tau}_i$
  1    0.053               23580               1241
  2    0.250               10300               2575
  3    0.667               5700                3800
  4    1.500               2666.67             4000
  5    4.000               1000                4000
  6    19.00               275                 5225

Table 4.1: Composite Policy for the Metal Data, r = 0.5.

For example, row 5 indicates that non-failed devices on a linear usage path
of slope 4 are replaced when the number of low-load cycles accrued
reaches 1000. At this time, the number of high-load cycles accrued is 4000.

Figure 4.1: Metal Data with Policy for r = 0.5.

The solid lines represent the failure replacement region for the policy with 
replacement time vector (23580,10300,5700, 2666.67,1000, 275). 

This example also sheds light on ways to reduce the computational burden of
finding $\hat{\tau}$: it is often unnecessary to compute $\hat{C}$ at the non-dominated point in every
$H \in \mathcal{H}$. We recommend first finding the unrestricted minimizer $\tilde{\tau}$. A basic optimization
principle is that if the solution of a relaxed problem happens to satisfy the constraints, then it
also solves the constrained problem. This principle implies that if $\tilde{\tau} \in A$, then $\hat{\tau} = \tilde{\tau}$. Thus, if the
unrestricted minimizer lies in the set A, no further computation is necessary. Computing
$\tilde{\tau}$ can save computation even if $\tilde{\tau} \notin A$. In some cases, $\tilde{\tau}$ may violate only one constraint
defining the set A; restricting the coordinates causing the violation (while leaving the
others relaxed) may lead to an optimal solution. More specifically, suppose that
$\tilde{\tau} = (\tilde{\tau}_1, \tilde{\tau}_2, \ldots, \tilde{\tau}_m)$ is such that for some k in {1, ..., m − 1}, either $\tilde{\tau}_k < \tilde{\tau}_{k+1}$ or
$\theta_k\tilde{\tau}_k > \theta_{k+1}\tilde{\tau}_{k+1}$. Let $\tau'_k$ and $\tau'_{k+1}$ minimize

$$\hat{C}(\tau_k, \tau_{k+1}) = \hat{C}_k(\tau_k)\,p_k + \hat{C}_{k+1}(\tau_{k+1})\,p_{k+1},$$

subject to $(\tau_k, \tau_{k+1}) \in A_k$, where

$$A_k = \{(\tau_k, \tau_{k+1}) \in (0,\infty)^2 : \tau_k \ge \tau_{k+1},\ \theta_k\tau_k \le \theta_{k+1}\tau_{k+1}\}.$$

Let $\tau'$ denote the vector formed by replacing $\tilde{\tau}_k$ and $\tilde{\tau}_{k+1}$ in $\tilde{\tau}$ with $\tau'_k$ and $\tau'_{k+1}$,
respectively. It can be shown that if $\tau' \in A$, then $\hat{\tau} = \tau'$. This approach works for the
metal data for r = 0.5; recall from Case Study 3 that $\tilde{\tau}$ violates one constraint defining
set A. This approach applies sequentially on the metal data for r = 0.75; in this case $\tilde{\tau}$
violates two constraints.
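The first step of this shortcut, computing the unrestricted minimizer and testing whether it lies in A, might look like the sketch below. The function names are hypothetical, the empirical survivor function is again taken as the fraction of lifetimes greater than or equal to its argument, and only the observed lifetimes plus infinity are tried as candidates, following the "strategic points" argument.

```python
import numpy as np


def path_minimizer(lifetimes, K, C):
    """Minimize the empirical cost (4.4) over the observed lifetimes and infinity."""
    x = np.sort(np.asarray(lifetimes, dtype=float))
    candidates = np.append(x, np.inf)   # infinity: replace only at failure
    costs = [(K + C - C * np.mean(x >= t)) / np.mean(np.minimum(x, t))
             for t in candidates]
    k = int(np.argmin(costs))
    return candidates[k], costs[k]


def relaxed_solution(samples, theta, K, C):
    """Return (tau, feasible): per-path minimizers and whether they lie in A."""
    tau = [path_minimizer(s, K, C)[0] for s in samples]
    ordered = all(a >= b for a, b in zip(tau, tau[1:]))
    aged = [th * t for th, t in zip(theta, tau)]
    monotone = all(a <= b for a, b in zip(aged, aged[1:]))
    finite = all(np.isfinite(t) for t in tau)
    return tau, bool(ordered and monotone and finite)
```

The mixing probabilities do not appear because each path is minimized separately. When `feasible` is true the relaxed solution is already the constrained one; otherwise the violated pairs indicate where a restricted two-coordinate search could start.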

E. COMPARISON WITH SCALE-COMBINING APPROACHES 

The scale-combining methods discussed in Chapter III differ fundamentally from
our estimation procedure in their motivation, but in some cases produce sensible policies.
The “best scale” method seeks the linear time scale $t(a) = (1 - a)x + a\,y(x)$, $a \in [0,1]$, with
corresponding t(a)-scale replacement time $\tau_a$, that yields the lowest long-run average cost
(per unit of chronological time, after “conversion”). The min CV method seeks the linear
scale corresponding to the smallest lifetime CV. Both of these procedures use the data to
produce a linear time scale and hence a policy of the form $M_x$, $M_y$, or a triangular set
$M_a = \{(x, y(x)) : t(a) \le \tau_a\}$. In contrast, the policies produced by our procedure are
required only to be lower sets. This is a broader class of policies than those resulting
from a linear scale.

Ideal time scale (ITS) methods seek the scale t such that $P[T > t_0 \mid Z]$ does not depend
on the path Z; hence, a policy based on an ITS has the property that the probability of
failure before replacement in this scale is the same, regardless of the path. Most of the
focus of Duchesne (1999) is on inference procedures for the parameters of ITS models
which are either linear (i.e., $t = x + g\,y(x)$, $g > 0$) or multiplicative (i.e., $u = x^{1-\eta}\,y(x)^{\eta}$,
$0 \le \eta \le 1$). In the case of linear paths with slopes $\theta \in \{\theta_1, \ldots, \theta_m\}$, these scales always
result in sensible policies. To demonstrate this, suppose the data are reasonably
described by a linear ITS model $t = x + g\,y(x)$. The “best” scale for age replacement and
min CV scale can be re-parameterized to this form. The policy takes the following form:
replace non-failed devices when $x + g\,y(x) = \hat{t}$. It follows that the replacement time
vector is $(\hat{t}/(1 + g\theta_1), \ldots, \hat{t}/(1 + g\theta_m)) \in A$. Restricting attention to preventive
maintenance policies formed by ITS models may be appropriate in some cases; however,
we have noted in Chapter III that the non-uniqueness of an ITS can cause problems for
estimation of age replacement policies even when the ITS has a simple parametric form.
Unfortunately, given a set of lifetime data (along linear paths or otherwise) it is rarely
clear which (if any) parametric form the ITS should take. Duchesne (1999) suggests a
non-parametric procedure for estimating the true, unknown ITS; this procedure links the
quantiles along the paths. Policies based on the resulting scale can be constructed which
are not lower sets.

In Chapter II we have noted that in the single-scale problem, policies
corresponding to a sequence of decreasing cost ratios are “nested.” We have also 
observed that this quality is desirable for multiple-scale policies because non-nested, 
multiple-scale policies prescribe replacement times for devices on some paths that are 
inconsistent with respect to the corresponding cost ratios. We have also observed in 
Chapter III that policies based on either the min CV scale or on an ITS are nested, but 
policies based on the “best” scale for age replacement method are not guaranteed to be 
nested. Due to the nature of the single-scale cost function (1.3) and in turn (4.5), the 
policies produced by our procedure are not necessarily nested. However, we show in 
Chapter V that in practice, our procedure tends to produce nested policies even with 
small samples. In such cases, our procedure forms a time scale based on the cost ratio r. 
The points along each path corresponding to the replacement time for a given r have the 
same age in this scale. Also, in a manner analogous to the cost sensitivity analyses 
conducted with the aid of TTT plots, we find there are ranges of r over which the same 
composite policy is valid. Combined scales, on the other hand, essentially order the 
observations based on their lifetimes in the combined scales; points along contours of 
these scales are the same “age” in these scales, indicating they have, in a sense, 
accumulated the same level of damage. 
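The claim made earlier in this section, that a linear ITS policy always yields a replacement time vector in A, can be checked numerically. The sketch below is illustrative: the function name and the values of `t_hat` and `g` are hypothetical choices, not estimates from any data set discussed here.

```python
def its_replacement_times(t_hat, g, theta):
    """Chronological ages x solving x + g * theta_i * x = t_hat on each path."""
    return [t_hat / (1.0 + g * th) for th in theta]


# Hypothetical inputs: replacement age 30 in the combined scale, g = 0.25.
theta = [0.5, 2.0, 8.0]
tau = its_replacement_times(30.0, 0.25, theta)
ages = [th * t for th, t in zip(theta, tau)]   # ages in the second scale

# Both orderings required by (4.3) hold for any t_hat > 0 and g > 0.
chronological_ok = all(a >= b for a, b in zip(tau, tau[1:]))
usage_ok = all(a <= b for a, b in zip(ages, ages[1:]))
```

The first ordering holds because $1/(1 + g\theta)$ decreases in $\theta$, and the second because $\theta/(1 + g\theta)$ increases in $\theta$.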

F. DISCUSSION AND SUMMARY 

In this chapter we developed a method of estimating the optimal “sensible” policy
given lifetimes from a population of devices which age along linear paths. Under such a
policy, non-failed devices on path $Z_i$ are replaced when their chronological age reaches $\tau_i$,
i = 1, ..., m. As such, this composite policy technically applies only to devices on these
paths. Policies based on combined scales of the form considered in Chapter III have this
same form when applied to data on linear paths. The assumption that devices age exactly
along linear paths is usually an approximation of reality; thus, it is worthwhile to consider
ways to extend these policies to ones that apply to devices on any path. The policy $(0, \tau)$
in a combined scale t extends in a natural way to the region $\{(x, y(x)) : t \le \tau\}$ in the
positive quadrant, as exemplified in Figure 3.4 and Figure 3.5.

The key consideration for extending the policy produced by our estimation
procedure is to ensure that the resulting policy is a lower set with respect to the matrix
partial order on $(0,\infty)^2$. Consider, for example, a population of devices aging along lines
of slope $\theta_1 = 0.5$, $\theta_2 = 2$, or $\theta_3 = 8$. Suppose that for some r > 0 the replacement times are
$\tau_1 = 20$, $\tau_2 = 10$, and $\tau_3 = 5$, respectively. The solid line segments in Figure 4.2 represent
the failure replacement region for this policy. To extend this policy to the positive
quadrant, we need a non-increasing function on $(0,\infty)$ that is contained within the
rectangular regions delimited by the dashed lines in Figure 4.2. This function induces a
boundary of the failure replacement region; non-failed devices are replaced when their
usage curve crosses this boundary. A “conservative” extension is to choose a step
function coincident with the lower boundaries of the boxes; a more “aggressive”
extension is to choose a step function coincident with the upper boundaries of the boxes
(in this case there is no usage limit for devices with x < 5). Between the two extremes,
we arbitrarily choose a smooth curve through the policy points {(20,10), (10,20), (5,40)},
as depicted in Figure 4.2. We address the problem of determining the cost of
implementing such policies in Chapter VI.
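The two step-function extensions can be written down explicitly. The sketch below is one reading of the description above, with hypothetical function names; in particular, the values assigned exactly at the box edges and beyond the largest replacement time are assumptions of this illustration.

```python
import math


def conservative_limit(x, points):
    """Usage limit at chronological age x along the lower edges of the boxes."""
    # Largest usage among policy points not yet passed in chronological time;
    # zero beyond the last point, i.e. immediate replacement.
    return max((y for t, y in points if t >= x), default=0.0)


def aggressive_limit(x, points):
    """Usage limit at chronological age x along the upper edges of the boxes."""
    # Smallest usage among policy points already reached in chronological time;
    # infinite before the first point, i.e. no usage limit.
    return min((y for t, y in points if t <= x), default=math.inf)


# Policy points (chronological age, usage) from the example in the text.
points = [(20, 10), (10, 20), (5, 40)]
```

Both functions are non-increasing in x and pass through the three policy points, so either induces a lower set. For a device at chronological age 7, the conservative boundary allows usage up to 20 while the aggressive one allows up to 40.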



Figure 4.2: Extension of Estimated Optimal Policy. 

The solid lines represent the failure replacement region for the policy with
replacement time vector (20, 10, 5). The dashed lines represent bounds for
a non-increasing function serving as a policy boundary under the lower set 
restriction. The smooth curve represents the boundary of one possible 
extension of the policy based on the linear paths of slope 0.5, 2, and 8. 


Additionally, we note that our focus in this chapter has been on completely
non-parametric estimation of the optimal policy. We acknowledge it is also possible to
estimate the $F_i$ under the restriction that the estimates be IFR. Ingram and Scheaffer
(1976), however, find little value added from the increased computational burden over
empirical estimation. We remark that if parametric (or other nonparametric) distributions
are fit to each $F_i$ and a $\tilde{\tau}_i$ estimating $\tau_i^*$ is found for a given r > 0, the vector
$(\tilde{\tau}_1, \tilde{\tau}_2, \ldots, \tilde{\tau}_m)$ is not necessarily in A. It is possible, however, to estimate parameters of
certain collections $\{F_i\}$ under the restriction that $(\tilde{\tau}_1, \tilde{\tau}_2, \ldots, \tilde{\tau}_m)$ be in A. Gertsbakh and
Kordonsky (1998) consider an example of such a collection. They discuss estimation in
the Weibull family under which the shape parameter is constant for all paths but the scale
parameters are allowed to vary. Geurts (1983) acknowledges optimal age replacement
times in the Weibull family are relatively insensitive to the shape parameter, so in our
setting this seems to be a reasonable approach. In such a case, it can be shown that if the
scale parameters satisfy conditions akin to (4.3), the resulting composite policy
$(\tilde{\tau}_1, \tilde{\tau}_2, \ldots, \tilde{\tau}_m)$ is in A. General conditions under which $(\tilde{\tau}_1, \tilde{\tau}_2, \ldots, \tilde{\tau}_m)$ is in A need further
study.

Finally, in this chapter we focus on linear paths in two scales. The concepts
developed here can be generalized to more than two scales. For example, m linear paths
in k + 1 scales can be represented by $(x, y_1(x), \ldots, y_k(x))$ where $y_j(x) = \theta_{ij}\,x$, i = 1, ..., m,
j = 1, ..., k. For m such paths, as in two scales, an age replacement policy need only specify
replacement ages $(\tau_1, \tau_2, \ldots, \tau_m)$ in chronological time. In addition, the cost function (4.2)
remains the same. The difference comes in specifying constraints (4.3) so that the policy
$(\tau_1, \tau_2, \ldots, \tau_m)$ is indeed sensible in the original scales. We speculate that the constraints
and the estimator will have form similar to those developed in this chapter. We do note,
however, that it is difficult to imagine practical applications of extending these results to
more than three scales.

V. PROPERTIES OF THE ESTIMATED OPTIMAL COMPOSITE POLICY

In this chapter we address the properties of the policy $\hat{\tau}$. We begin with a
discussion of its large-sample properties, and then investigate its small-sample behavior
through simulation. We conclude with simulation results aimed at comparing the
performance of the policies produced by our procedure with those based on the min CV
method.

A. LARGE-SAMPLE PROPERTIES 

Let $\hat{S}_i$ be a uniformly strongly consistent estimator of $S_i$, i = 1, ..., m. For
example, if lifetimes along path i are from a simple random sample, then taking $\hat{S}_i$ to be
the empirical survivor function (1.2) gives a non-parametric estimator of $S_i$, which by the
Glivenko-Cantelli lemma converges uniformly to $S_i$ with probability 1. On the other
hand, should lifetimes along path i be right-censored, depending on the censoring
mechanism, the Kaplan-Meier estimator is an appropriate choice for $\hat{S}_i$. With such an
estimator and the assumption that $\tau_i^* < \infty$ exists and is unique (e.g., if $F_i$ is IFR with
failure rate strictly increasing to $\infty$), it is well known (e.g., Arunkumar, 1972) that $\tilde{\tau}_i$
minimizing $\hat{C}_i(\tau_i)$ is a strongly consistent estimator of $\tau_i^*$. From this it follows, for the
composite policy with replacement time vector $\tilde{\tau} = (\tilde{\tau}_1, \ldots, \tilde{\tau}_m)$, that

$$\max_{1 \le i \le m} |\tilde{\tau}_i - \tau_i^*| \to 0$$

with probability 1 as $n_i \to \infty$, for all i = 1, ..., m. This result, of course, does not require
the estimated policy $\tilde{\tau}$ to be in A even if $(\tau_1^*, \ldots, \tau_m^*)$ is in A.


Showing the strong consistency of the estimator $\hat{\tau}$, which is required to be
in A, takes a bit more care. With the individual $\tau_i^* < \infty$ and unique, and $\hat{S}_i$ a uniformly
strongly consistent estimator of $S_i$, a small modification of Ingram and Scheaffer’s
(1976) argument shows that $\hat{C}_i$ converges uniformly to $C_i$ in an interval bounded away
from zero with probability 1. In particular, Ingram and Scheaffer (1976) show that
$\int_0^\infty S_i(u)\,du < \infty$ by appealing to the condition that $F_i$ be IFR. However, this is also true if
$\tau_i^* < \infty$ and unique because for $t > \tau_i^*$, $0 < C_i(\tau_i^*) < C_i(t)$ and hence

$$\int_0^\infty S_i(u)\,du = (K + C) \lim_{t \to \infty} \frac{1}{C_i(t)} < \infty.$$

For the multiple-scale functions $\hat{C}(\tau)$ and $C(\tau)$, we have

$$\left|\hat{C}(\tau) - C(\tau)\right| = \left|\sum_{i=1}^{m} p_i \hat{C}_i(\tau_i) - \sum_{i=1}^{m} p_i C_i(\tau_i)\right| \le \sum_{i=1}^{m} p_i \left|\hat{C}_i(\tau_i) - C_i(\tau_i)\right|.$$

Thus, for a > 0 we see that $\hat{C}(\tau)$ converges uniformly to $C(\tau)$ in the m-dimensional
region $[a,\infty)^m$. Suppose $\tau \notin [a,\infty)^m$, so that $\tau_j < a$ for some j = 1, ..., m. Then $C(\tau)$, and
similarly $\hat{C}(\tau)$, are bounded below as follows:

$$C(\tau) \ge p_j C_j(\tau_j) = p_j\, \frac{K + C - C\,S_j(\tau_j)}{\int_0^{\tau_j} S_j(u)\,du} \ge \frac{K}{a} \min_{1 \le i \le m} p_i.$$

An application of the multivariate analog of Theorem 1 of Arunkumar (1972, p. 252) then
gives strong consistency of $\hat{\tau}$ as an estimate of $\tau^*$, as stated in the following theorem.

Theorem 5.1. Let r > 0, $(\tau_1^*, \ldots, \tau_m^*) \in A$ be unique, where A is defined in
(4.3), $\tau_i^* < \infty$, and $\hat{S}_i$ be a uniformly strongly consistent estimator of $S_i$, i = 1, ..., m.
Then

$$\max_{1 \le i \le m} |\hat{\tau}_i - \tau_i^*| \to 0$$

with probability 1 as each $n_i \to \infty$, i = 1, ..., m.

We note that the proof of Theorem 5.1 does not require $\hat{\tau}$ to be unique. Indeed,
with $\hat{S}_i$ as the empirical survivor function, uniqueness of $\hat{\tau}$ is not guaranteed. In
addition, although $(\tau_1^*, \ldots, \tau_m^*) \in A$ for most practical cases, this is not a strict
requirement. What is required in the proof of Theorem 5.1 is the existence of a unique $\tau^*$
minimizing $C(\tau)$ among $\tau \in A$ and that $\tau^*$ has finite elements. Weak convergence of $\hat{\tau}$ is
not studied here. Arunkumar (1972) does develop the asymptotic distribution of the
minimizer of (1.3) in the one-dimensional case. Perhaps Arunkumar’s approach can be
used to establish weak convergence for the multi-dimensional, restricted estimator $\hat{\tau}$.

Furthermore, for large samples, the estimators of the optimal policies are nested.
Suppose s < r, and let $\tau_i^*(s)$ and $\tau_i^*(r)$ minimize $C_i(\tau_i)$ with respective cost ratios s and r,
i = 1, ..., m. If $(\tau_1^*(s), \ldots, \tau_m^*(s)) \preceq (\tau_1^*(r), \ldots, \tau_m^*(r))$, the corresponding failure
replacement regions are nested. Suppose both $(\tau_1^*(s), \ldots, \tau_m^*(s))$ and
$(\tau_1^*(r), \ldots, \tau_m^*(r))$ are in A and $\tau_i^*(s) < \tau_i^*(r)$, i = 1, ..., m. Then, it follows from
Theorem 5.1 that with probability 1, for all $n_1, n_2, \ldots, n_m$ large enough, the estimated
policies satisfy $\hat{\tau}(s) \preceq \hat{\tau}(r)$, and thus their corresponding failure replacement regions will be
nested.

B. SMALL-SAMPLE BEHAVIOR 
1. General Simulation Results 

We use simulation to gain insight into the behavior of the estimated cost function
and policy for small sample sizes. In this simulation, devices have “low,” “medium,” or
“high” rates of use, corresponding to usage paths of slope $\theta_1 = 1$, $\theta_2 = 2$, or $\theta_3 = 5$. For
each path, lifetimes arise from the Weibull distribution, with density

$$f(t; \beta, \varphi) = \frac{\beta}{\varphi}\left(\frac{t}{\varphi}\right)^{\beta - 1} \exp\left\{-\left(\frac{t}{\varphi}\right)^{\beta}\right\}, \qquad t > 0. \tag{5.2}$$

As in the simulations of Ingram and Scheaffer (1976) we fix the shape parameter $\beta = 2$
for each path. Gertsbakh and Kordonsky (1998) also assume the Weibull shape
parameter is constant over paths. The scale parameter $\varphi$ is varied for the three paths so
that $\varphi_1 = 40/21$, $\varphi_2 = 10/7$, $\varphi_3 = 1$ for paths 1, 2, and 3, respectively. These scale
parameters ensure $(\tau_1^*, \tau_2^*, \tau_3^*)$ lies in A for any r > 0.
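The true optimal replacement ages for this Weibull setting can be found numerically. The sketch below rests on two assumptions that are not stated in this section: the cost ratio r is taken to be K/C, and the optimum is located from the standard first-order condition $h(t)\int_0^t S(u)\,du - F(t) = K/C$. The function name and the bisection bracket are illustrative.

```python
import math


def optimal_age_weibull2(r, scale=1.0, lo=1e-6, hi=10.0, iters=200):
    """Bisection for the age-replacement optimum under Weibull shape 2.

    Solves h(t) * integral_0^t S(u) du - F(t) = r at unit scale (r = K/C is
    an assumption of this sketch), then rescales by the scale parameter.
    """
    def g(t):
        hazard = 2.0 * t
        area = 0.5 * math.sqrt(math.pi) * math.erf(t)   # integral of exp(-u^2)
        return hazard * area - (1.0 - math.exp(-t * t)) - r

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:    # g is increasing, so the root lies to the right
            lo = mid
        else:
            hi = mid
    return scale * 0.5 * (lo + hi)


scales = (40 / 21, 10 / 7, 1.0)
tau_star = [optimal_age_weibull2(1.0, s) for s in scales]
```

Because the optimum scales linearly with the scale parameter, the three components keep the ratio 40/21 : 10/7 : 1 for every r, which is why the vector remains in A when the slopes are 1, 2, and 5. Under the stated assumptions the values for r = 1 come out close to the $\tau^*$ entries reported for run 1 in Table 5.2.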

Four groups of simulations are performed to investigate the small-sample
behavior of $\hat{C}(\tau)$ and $\hat{\tau}$ as sample sizes along paths $n = (n_1, n_2, n_3)$, mixing probabilities
$p = (p_1, p_2, p_3)$, and cost ratio r vary. Each group corresponds to realistic settings for n
and p. There are three runs within each group, to investigate the effects of varying r.
Table 5.1 depicts the settings used in each run.



  Group    Run    n             p                  r
  1        1      (5,5,5)       (1/3,1/3,1/3)      1.0
  1        2      (5,5,5)       (1/3,1/3,1/3)      0.5
  1        3      (5,5,5)       (1/3,1/3,1/3)      0.1
  2        4      (5,5,5)       (0.1,0.8,0.1)      1.0
  2        5      (5,5,5)       (0.1,0.8,0.1)      0.5
  2        6      (5,5,5)       (0.1,0.8,0.1)      0.1
  3        7      (10,10,10)    (1/3,1/3,1/3)      1.0
  3        8      (10,10,10)    (1/3,1/3,1/3)      0.5
  3        9      (10,10,10)    (1/3,1/3,1/3)      0.1
  4        10     (10,10,10)    (0.1,0.8,0.1)      1.0
  4        11     (10,10,10)    (0.1,0.8,0.1)      0.5
  4        12     (10,10,10)    (0.1,0.8,0.1)      0.1

Table 5.1: Settings for General Simulation Runs


Sample sizes of 5 and 10 are common, particularly in observational data or experiments 
designed to study the lifetime of high-cost prototypic devices. Mixing probabilities 
(1/3, 1/3, 1/3) represent populations for which devices are evenly spread across several 
usage rates; and mixing probabilities (0.1,0.8,0.1) represent populations for which a 
large majority of the devices have a “medium” rate of use (e.g., automobiles). Table 5.1 
contains runs for which the relative frequencies of the sample sizes along paths differ 
from the mixing probability vector since it is not uncommon for the mixture of test assets 
to differ from the mixture in the actual population. Finally, the cost ratios 1, 0.5, and 0.1
are common in the literature.

Each run of the simulation consists of 200 replications. In replication j, we
generate a data set consisting of $n_i$ Weibull(2, $\varphi_i$) lifetimes, i = 1, 2, 3, and for this data set
we find $\hat{\tau}^{(j)}$ corresponding to the given p and r using the procedure described in Chapter
IV. The random number seed is set in advance for replicability. For each run, we
compute several quantities to gain insight into the small-sample performance of $\hat{\tau}$ as an
estimator of $\tau^*$. Table 5.2 contains $\tau^*$, the minimizer of the true cost function $C(\tau)$,
found numerically. It also lists $Av(\hat{\tau}) = (1/200)\sum_{j=1}^{200} \hat{\tau}^{(j)}$, an estimate of the expected
value of $\hat{\tau}$, and the difference $Av(\hat{\tau}) - \tau^*$, an estimate of the bias of $\hat{\tau}$. Finally, it
includes $p(\tilde{\tau})$, the proportion of the replications for which $\hat{\tau} = (\tilde{\tau}_1, \tilde{\tau}_2, \tilde{\tau}_3)$; this quantity
reveals how often $\tilde{\tau} \in A$ and hence we find $\hat{\tau}$ “automatically,” with minimal
computation.


  Run   $\tau^*$                  $Av(\hat{\tau})$             $Av(\hat{\tau}) - \tau^*$      $p(\tilde{\tau})$
  1     2.078  1.558  1.091      2.005  --     --           --      --      --            --
  2     1.406  1.054  0.738      1.471  1.048  0.713        --      --      --            --
  3     0.607  0.456  0.319      0.866  --     0.407        0.258   0.151   0.088         0.210
  4     --     --     --         2.180  --     0.973        --      --      --            0.225
  5     --     --     --         1.607  --     0.706        --      --      --            0.260
  6     --     --     --         --     0.634  0.424        0.316   0.179   0.105         0.210
  7     --     1.558  --         2.121  1.545  1.035        0.044  -0.013  -0.056         0.250
  8     --     --     --         --     --     --           0.061   0.010   0.009         0.330
  9     --     --     --         --     --     --           0.146   0.082   0.038         0.225
  10    --     1.558  --         --     1.601  --           0.223   --      --            0.250
  11    1.406  1.054  0.738      1.514  --     --           0.108   --      --            --
  12    0.607  0.456  0.319      0.768  --     --           --      0.095   --            --

  (-- denotes an entry that is not legible.)

Table 5.2: Small-Sample Performance of $\hat{\tau}$


First, by comparing rows 1-6 with rows 7-12 in Table 5.2, we note that increasing
sample sizes generally results in an increase in the (estimated) accuracy of $\hat{\tau}$. As
expected, increasing sample sizes increases the proportion of replications for which
$\hat{\tau} = (\tilde{\tau}_1, \tilde{\tau}_2, \tilde{\tau}_3)$. To investigate the effect of a non-uniform p on $\hat{\tau}$ in small-sample
situations, compare rows 1-3 with rows 4-6 and rows 7-9 with rows 10-12. In general,
the accuracy of $\hat{\tau}$ decreases slightly, but this effect is reduced as the sample sizes
increase. By examining columns 4-6 of the rows within each group, we note the
“average” policies are nested.

We proceed as follows to determine if the policies produced in each individual
replication of a given run are nested. By setting the random seed, we generate the same
lifetimes for each run in the first two groups and in the last two groups. Hence, for
example, the estimated policies for replication j of runs 1, 2, and 3 are based on the same
random numbers. For a fixed group, let $\hat{\tau}^{(j)}(r)$ denote the estimated policy for cost ratio r
given the data for replication j. It can be shown that these policies are nested if
$\hat{\tau}^{(j)}(0.1) \preceq \hat{\tau}^{(j)}(0.5) \preceq \hat{\tau}^{(j)}(1)$. For each of the four groups, we find that nesting occurs in
each of the 200 replications.
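The nesting check amounts to a componentwise comparison of consecutive policies. A minimal sketch follows, with a hypothetical function name; the policies are assumed to be listed in order of increasing cost ratio. The example vectors are the metal-data policies reported in Chapter IV for r = 0.5, 0.75, and 1.

```python
def is_nested(policies):
    """True if each policy is componentwise <= the next one in the list."""
    return all(all(a <= b for a, b in zip(lower, upper))
               for lower, upper in zip(policies, policies[1:]))


# Metal-data policies from Chapter IV: only the second component changes.
policy_050 = [23580, 10300, 5700, 2666.67, 1000, 275]
policy_075 = [23580, 15200, 5700, 2666.67, 1000, 275]
policy_100 = list(policy_075)   # reported as identical to the r = 0.75 policy

nested = is_nested([policy_050, policy_075, policy_100])
```

Reversing the order of the first two policies makes the check fail, since the second component would then decrease.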

For each run, we also compute several quantities to gain insight into the
small-sample performance of $\hat{C}(\tau)$ as an estimator of the true cost $C(\tau)$. First, we compute
$C(\tau^*)$, the exact cost of the true optimal policy, from (4.2). Next, we compute
$Av[\hat{C}(\hat{\tau})] = (1/200)\sum_{j=1}^{200} \hat{C}^{(j)}(\hat{\tau}^{(j)})$, an estimate of the expected estimated minimum cost
of age replacement, and then the sample standard deviation of the $\hat{C}^{(j)}(\hat{\tau}^{(j)})$. We also
compute $Av[C(\hat{\tau})] = (1/200)\sum_{j=1}^{200} C(\hat{\tau}^{(j)})$, an estimate of the expected true cost at the
estimated optimal policy. Finally, we compute $b[\hat{C}(\hat{\tau})] = Av[\hat{C}(\hat{\tau})] - Av[C(\hat{\tau})]$ and
$MSE[\hat{C}(\hat{\tau})] = (1/200)\sum_{j=1}^{200} \left(\hat{C}^{(j)}(\hat{\tau}^{(j)}) - C(\hat{\tau}^{(j)})\right)^2$, estimates of the bias and MSE of $\hat{C}(\hat{\tau})$
as an estimator of $C(\hat{\tau})$, respectively. These quantities are scaled by the factor 1/C and
displayed in Table 5.3.
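Given the per-replication estimated and true costs, these summaries reduce to a few array operations. The sketch below uses hypothetical names and made-up inputs purely to show the arithmetic; it does not reproduce any value from the simulation.

```python
import numpy as np


def summarize(est_cost, true_cost):
    """Summaries of estimated costs against true costs across replications."""
    est = np.asarray(est_cost, dtype=float)
    tru = np.asarray(true_cost, dtype=float)
    return {
        "av_est": est.mean(),                  # average estimated minimum cost
        "sd_est": est.std(ddof=1),             # sample standard deviation
        "av_true": tru.mean(),                 # average true cost at the estimate
        "bias": est.mean() - tru.mean(),       # b = Av[est] - Av[true]
        "mse": np.mean((est - tru) ** 2),      # mean squared difference
    }


# Made-up inputs for illustration only.
stats = summarize([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A negative bias indicates that the estimated minimum cost tends to understate the true cost of the policy actually selected.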



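  Run   $C(\tau^*)$   $Av[\hat{C}(\hat{\tau})]$   $SD[\hat{C}(\hat{\tau})]$   $Av[C(\hat{\tau})]$   $b[\hat{C}(\hat{\tau})]$   $MSE[\hat{C}(\hat{\tau})]$
  1     1.618         --                          0.272                       1.679                 -0.199                     0.107
  2     --            --                          --                          1.150                 -0.246                     0.094
  3     --            --                          --                          0.518                 -0.247                     0.074
  4     1.554         --                          0.351                       --                    --                         0.142
  5     --            --                          0.249                       --                    --                         --
  6     --            --                          0.145                       --                    --                         --
  7     --            --                          --                          --                    --                         0.052
  8     --            --                          --                          --                    -0.173                     0.049
  9     --            --                          --                          --                    -0.185                     0.042
  10    1.554         1.465                       0.242                       --                    --                         --
  11    1.052         --                          0.178                       --                    --                         --
  12    0.454         0.300                       0.115                       0.490                 -0.191                     0.047

  (-- denotes an entry that is not legible.)

Table 5.3: Small-Sample Performance of $\hat{C}(\hat{\tau})$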


As in Table 5.2, by comparing rows 1-6 with rows 7-12 of Table 5.3, we note
that increasing the sample sizes results in an increase in the (estimated) accuracy and
precision of $\hat{C}(\tau)$ as an estimator of $C(\tau)$. To investigate the effect of a non-uniform p
on $\hat{C}(\tau)$, compare rows 1-3 with rows 4-6 and rows 7-9 with rows 10-12. In general, the
accuracy and precision of $\hat{C}(\tau)$ decrease slightly, but this effect is reduced as the
sample sizes grow.

2. Results of Nesting Simulation


We also use simulation to investigate in more detail the nesting tendency of the
policies produced by our procedure. In the general simulation, we used the sequence of
cost ratios {1, 0.5, 0.1}; in this simulation we use a more refined sequence {1, 0.9,
..., 0.1}. We retain the same slopes and Weibull parameters as in the general simulation.
The nesting simulation consists of 4 runs of 20 replications each; for each replication we
use a new random number seed. To investigate the effect of sub-sample size and mixing
probability on nesting, we vary n and p between runs. The settings for n and p for the
four runs coincide with the settings in groups 1-4 in Table 5.1 (i.e., run 1 has the same
settings as in Group 1, and so on). In each replication of a given run, we generate $n_i$
Weibull(2, $\varphi_i$) lifetimes, i = 1, 2, 3; for this data set we find $\hat{\tau}^{(j)}(r)$ for each r in {1, 0.9,
..., 0.1} and we check whether $\hat{\tau}^{(j)}(0.1) \preceq \hat{\tau}^{(j)}(0.2) \preceq \cdots \preceq \hat{\tau}^{(j)}(1)$. For each run, we find
that nesting occurs in each of the 20 replications.

C. COMPARISON WITH MIN CV METHOD 

We further use simulation to gain insight into the performance of composite
policies estimated using our procedure in comparison with composite policies
estimated using the min CV procedure. Here, we compare the true costs of the policies
produced by the two procedures using the sample sizes, mixing probabilities, and cost
ratios contained in Table 5.1. As in the general simulation, we use devices with usage
paths of slope $\theta_1 = 1$, $\theta_2 = 2$, or $\theta_3 = 5$ and assume that
$X \mid \theta_i \sim$ Weibull(2, $\varphi_i$), i = 1,2,3. But in
this simulation, the scale parameters $\varphi_i$ correspond to distributions for which the min
CV method is expected to return reasonable estimates of $(\tau_1^*, \tau_2^*, \tau_3^*)$.

Unlike our procedure, the min CV method is not designed specifically for the
purpose of estimating $(\tau_1^*, \tau_2^*, \tau_3^*)$. Nonetheless, for certain families of conditional
distributions, the policy based on the min CV method does in fact estimate $(\tau_1^*, \tau_2^*, \tau_3^*)$.
Consider, for example, a population of devices on linear usage paths Z whose lifetimes
correspond to the model

$$P[X > x \mid Z] = \exp\left\{-\left[\frac{x + \gamma\, y(x)}{\varphi}\right]^{\beta}\right\},$$

where $y(x)$ denotes the usage accumulated by chronological time x.
That is, devices have lifetimes corresponding to the linear ITS model with time scale
parameter $\gamma$. The times in the ITS have a Weibull distribution with shape parameter $\beta$
and scale parameter $\varphi$ (e.g., Duchesne and Lawless, 2000). It can be shown that along
paths we have $X \mid \theta \sim$ Weibull($\beta$, $\varphi/(1 + \gamma\theta)$). Suppose $\beta = 2$, $\varphi = 4$, and $\gamma = 3/5$. It
follows that $X \mid \theta_i \sim$ Weibull(2, $\varphi_i$), where $\varphi_1 = 2.5$, $\varphi_2 = 20/11$, and $\varphi_3 = 1$; these scale
parameters are used throughout the study. These scale parameters ensure $(\tau_1^*, \tau_2^*, \tau_3^*)$
lies in A for any r > 0.
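As a quick check of the arithmetic, the three path-specific scale parameters follow from the relation scale = 4/(1 + (3/5) * slope) at the slopes 1, 2, and 5. The helper name `path_scale` is an assumption made for this sketch; exact fractions are used so that 20/11 is reproduced without rounding.

```python
from fractions import Fraction


def path_scale(phi, gamma, theta):
    """Weibull scale along a linear usage path of slope theta under the
    linear ITS model: phi / (1 + gamma * theta)."""
    return phi / (1 + gamma * theta)


phi, gamma = Fraction(4), Fraction(3, 5)
scales = [path_scale(phi, gamma, theta) for theta in (1, 2, 5)]
print(scales)  # [Fraction(5, 2), Fraction(20, 11), Fraction(1, 1)]
```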

For a given r > 0, our procedure always returns a policy with lower estimated cost
than any other policy in (4.3). But since the true $\tau^*$ in this simulation corresponds to a
triangular policy, and min CV restricts attention to such policies, we would expect the
policy based on the min CV scale to have lower actual cost than our estimated policy.
We find, though, that our procedure compares favorably in terms of true costs also. The
12 runs of this simulation use the n, p, and r as described in Table 5.1; each run of the
simulation consists of 200 replications. In a given replication, we generate $n_i$ lifetimes
from Weibull(2, $\varphi_i$), i = 1,2,3. From this data set we compute $\hat{a}$, resulting in $\hat{\tau}_{CV}$, the
policy produced by the min CV method. We also compute $\hat{\tau}$ using our method. Hence,
the results of each run are pairs $(\hat{\tau}_{CV}^{(j)}, \hat{\tau}^{(j)})$, j = 1, ..., 200. For each run, we compute
$C(\tau)$ at each of these values and (due to occasional non-normality) perform a Wilcoxon
signed-rank test on the differences $C(\hat{\tau}_{CV}^{(j)}) - C(\hat{\tau}^{(j)})$, j = 1, ..., 200. For every run we
reject the null hypothesis that the true mean difference is non-positive; approximate
p-values are 0 in each case. In fact, our estimator results in a lower-cost policy in 67% to
85% of the 200 replications for each run.
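The paired comparison can be sketched with a one-sided Wilcoxon signed-rank test. The version below is an illustrative stand-in, not the software used in the study: it uses the large-sample normal approximation with no continuity or tie correction to the variance, drops zero differences, and the function name is an assumption.

```python
import math


def wilcoxon_signed_rank(diffs):
    """One-sided Wilcoxon signed-rank test, normal approximation.

    Small p-values favor the alternative that the differences are shifted
    above zero. Returns (W_plus, z, approximate p-value).
    """
    d = [x for x in diffs if x != 0]            # drop zero differences
    n = len(d)
    if n == 0:
        raise ValueError("need at least one non-zero difference")
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                                # average ranks over tied |d|
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = 0.5 * math.erfc(z / math.sqrt(2))       # upper-tail normal probability
    return w_plus, z, p
```

Here `diffs` would hold the 200 differences between the true cost of the min CV policy and the true cost of our policy for one run.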
VI. POLICIES GIVEN DATA FROM UNKNOWN USAGE PATHS


Assume that (X,Y) has support $\mathcal{X} = (0,\infty)^2$ and that usage paths are unknown.
Unlike the setting with known linear usage paths, there is no natural way to write the cost
function in terms of one-dimensional cost functions and still be able to compute the cost
for any policy M in $\mathcal{M}_{\mathcal{X}}$. Approaches that use combined scales reduce the cost function to
a one-dimensional cost function in the combined scale, but they do so by restricting
policies to classes of nested policies. Combined scale approaches do not lend themselves
to comparison of policies that are not nested. In this chapter, we develop a cost function
that is a natural generalization of the one-dimensional cost function (1.1) and can be
applied to all policies in $\mathcal{M}_{\mathcal{X}}$.

In the single-scale problem, the cost function (1.1) has the interpretation “long- 
run average cost per unit of time in use,” and arises in a relatively natural way from 
univariate renewal theory. Under a joint model for (X,Y), it seems reasonable to consider
a cost function of the same nature as (1.1), with interpretation “long-run average cost per 
unit of time in use,” where “time in use” can be measured in chronological time or usage 
(e.g., flight hours or landings). In practice, budgets are often made with respect to 
chronological time, rather than usage. With this in mind, the cost function we develop 
has dimension cost per unit of use in chronological time. It does, however, incorporate 
both scales and could easily be taken to be cost per unit of usage. 

As in previous chapters, we consider policies M in $\mathcal{M}_{\mathcal{X}}$ under which a device is
replaced upon failure or when its usage path crosses the boundary of M, whichever
occurs first. We develop the two-dimensional renewal reward process as the foundation
on which we base the cost function for policies in $\mathcal{M}_{\mathcal{X}}$. For a given set of failure times,
we then demonstrate how to estimate an optimal rectangular policy in
$\mathcal{M}_{\mathcal{X}}$, and conclude with an example.

A. THE TWO-DIMENSIONAL RENEWAL REWARD PROCESS 

The cost function that we develop arises from considering renewal reward
processes (see Appendix A) in two dimensions. Let R(u,v) denote the rectangle
[0,u] x [0,v] with u > 0, v > 0. A stochastic process {N(u,v); u > 0, v > 0} is said to be a
two-dimensional counting process if N(u,v) represents the total number of events that
have occurred in R(u,v). Let $\{(U_i, V_i)\}$ be a sequence of independent and identically
distributed (iid) non-negative random vectors, and let $S_n^{(1)} = \sum_{i=1}^{n} U_i$ and $S_n^{(2)} = \sum_{i=1}^{n} V_i$.
Define $N(u,v) = \max\{n : S_n^{(1)} \le u,\ S_n^{(2)} \le v\}$. Then {N(u,v); u > 0, v > 0} is also a two-
dimensional renewal process (e.g., Hunter 1974a). Both $\{U_i\}$ and $\{V_i\}$ define univariate
renewal processes. With $N_u^{(1)} = \max\{n : S_n^{(1)} \le u\}$ and $N_v^{(2)} = \max\{n : S_n^{(2)} \le v\}$, it is
readily seen that $N(u,v) = \min\{N_u^{(1)}, N_v^{(2)}\}$. Let $R_n$ denote the reward earned at the nth
renewal. Assume the $R_n$, $n \ge 1$, are iid; note $R_n$ may depend on $(U_n, V_n)$. Let
$Z(u,v) = \sum_{n=1}^{N(u,v)} R_n$ represent the total reward earned in R(u,v). Then {Z(u,v); u > 0,
v > 0} is a two-dimensional renewal reward process.

Now, we generalize the univariate Renewal Reward Theorem. Let
$\mu_1 = E[U_1] < \infty$ and $\mu_2 = E[V_1] < \infty$; suppose also $E[R_1] < \infty$. Given a one-dimensional
renewal process $\{N(t);\ t \ge 0\}$ with mean inter-renewal time $\mu$, it is well known that the
total number of renewals $N(\infty)$ is infinite (e.g., Ross, 1997). For a two-dimensional
renewal process, let $N(\infty,\infty)$ be the number of renewals in a square of infinite size; that
is, $N(\infty,\infty) = \lim_{t \to \infty} N(t,t)$. We show that $N(\infty,\infty)$ cannot be finite.

Lemma 6.1: $N(\infty,\infty) = \infty$ with probability 1.

Proof: This proof is a generalization of Ross's (1997, p. 353) proof for the one-
dimensional case.

$$P\{N(\infty,\infty) < \infty\} = P\{U_n = \infty \text{ or } V_n = \infty \text{ for some } n\}
= P\Big\{\bigcup_{n=1}^{\infty} \{U_n = \infty \text{ or } V_n = \infty\}\Big\}
\le \sum_{n=1}^{\infty} P\{U_n = \infty \text{ or } V_n = \infty\} = 0.$$

The result follows by complementation.


Given a renewal process $\{N(t);\ t \ge 0\}$ with mean inter-renewal time $\mu$, it is also
well known that $\lim_{t \to \infty} N(t)/t = 1/\mu$ with probability 1 (e.g., Ross, 1997). That is, the rate
at which N(t) goes to infinity is the reciprocal of the mean inter-renewal time, with
probability 1. The following result considers the rate at which a two-dimensional
renewal process goes to infinity.

Lemma 6.2: $\lim_{t \to \infty} N(t,t)/t = 1/\max\{\mu_1, \mu_2\}$, with probability 1.

Proof: For any fixed t, $N(t,t) = \min\{N_t^{(1)}, N_t^{(2)}\}$. Also, for fixed t,
$\min\{N_t^{(1)}, N_t^{(2)}\}/t = \min\{N_t^{(1)}/t,\ N_t^{(2)}/t\}$. Since $\lim_{t \to \infty} N_t^{(1)}/t = 1/\mu_1$ and
$\lim_{t \to \infty} N_t^{(2)}/t = 1/\mu_2$ with probability 1, and the minimum is a continuous function,
it follows that $\lim_{t \to \infty} \min\{N_t^{(1)}/t,\ N_t^{(2)}/t\} = \min\{1/\mu_1, 1/\mu_2\} = 1/\max\{\mu_1, \mu_2\}$ with
probability 1.

Next, we generalize the Renewal Reward Theorem.

Theorem 6.1: $\lim_{t \to \infty} Z(t,t)/t = E[R_1]/\max\{\mu_1, \mu_2\}$ with probability 1.

Proof: Decompose $Z(t,t)/t$ as the product of $\sum_{n=1}^{N(t,t)} R_n / N(t,t)$ and $N(t,t)/t$. By
Lemma 6.1 and the Strong Law of Large Numbers the first term goes to $E[R_1]$ with
probability 1. By Lemma 6.2 the second term goes to $1/\max\{\mu_1, \mu_2\}$ with probability 1.
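Lemma 6.2 and Theorem 6.1 can be illustrated numerically. The sketch below is an assumption-laden toy, not part of the derivation: it draws the two components of each inter-renewal vector as independent exponentials (the results only require iid vectors), and the function name, the `reward` callback, and the seed are all introduced here for illustration.

```python
import random


def simulate_rates(t, mean_u, mean_v, reward, seed=0):
    """Simulate a two-dimensional renewal reward process on the square
    [0, t] x [0, t] and return (N(t,t)/t, Z(t,t)/t).

    The components U and V are drawn as independent exponentials with the
    given means; this is only one convenient choice of iid vectors.
    """
    rng = random.Random(seed)
    s1 = s2 = 0.0        # partial sums in each scale
    n = 0                # renewals counted so far, N(t, t)
    z = 0.0              # accumulated reward, Z(t, t)
    while True:
        u = rng.expovariate(1.0 / mean_u)
        v = rng.expovariate(1.0 / mean_v)
        if s1 + u > t or s2 + v > t:   # next renewal falls outside the square
            break
        s1 += u
        s2 += v
        n += 1
        z += reward(u, v)
    return n / t, z / t
```

With means 2 and 3, the first returned value should settle near 1/max{2, 3} = 1/3 for large t, and the second near the expected reward divided by 3.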

B. DEVELOPMENT OF COST FUNCTION FOR TWO-SCALE POLICIES 

We must modify the above results slightly before they can be applied to the 
setting in which the components of the two-dimensional inter-renewal times $\{(U_i, V_i)\}$ are
measured in different scales. In the case of two parallel time scales, the time units of the 
mean inter-renewal times in the denominator are not directly comparable. However, if 
we “convert” time in the usage scale (e.g., landings) to chronological time, we obtain a 
meaningful denominator. To this end, we prove a corollary to the theorem. 



Corollary 6.1: For a > 0, b > 0, $\lim_{t \to \infty} Z(at,bt)/t = E[R_1]/\max\{\mu_1/a,\ \mu_2/b\}$ with
probability 1.

Proof: From $\{(U_i, V_i)\}$ form the new renewal process $\{(W_i, Z_i)\}$ where $W_i = U_i/a$
and $Z_i = V_i/b$. Let $T_n^{(1)} = \sum_{i=1}^{n} W_i = S_n^{(1)}/a$ and $T_n^{(2)} = \sum_{i=1}^{n} Z_i = S_n^{(2)}/b$. Let
$\tilde{N}(t,t) = \max\{n : T_n^{(1)} \le t,\ T_n^{(2)} \le t\}$. As $E[W_i] = \mu_1/a$ and $E[Z_i] = \mu_2/b$, we have
$\lim_{t \to \infty} \tilde{N}(t,t)/t = 1/\max\{\mu_1/a,\ \mu_2/b\}$ with probability 1 from Lemma 6.2. But

$$\tilde{N}(t,t) = \max\{n : S_n^{(1)} \le at,\ S_n^{(2)} \le bt\} = N(at,bt).$$

This line of reasoning is essentially identical to Hunter's derivation of the limiting growth
rate of E[N(at,bt)] (1974b, pp. 555-6). The result follows immediately, using this fact
and the decomposition technique from the proof of Theorem 6.1.

Now we are positioned to use the results and discussion above to develop the
function with which we can compute the cost for a given member M of set $\mathcal{M}_{\mathcal{X}}$. Consider
the one-dimensional case in which a device has lifetime X and operates under the age
replacement policy $(0,\tau)$. Recall the interpretation of the objective function (1.1): the
long-run average cost per unit of “time in use” of implementing policy $(0,\tau)$. Here, the
“time in use” corresponding to lifetime X is simply the replacement time $\min\{X,\tau\}$.

Now, consider the two-dimensional case in which a device has lifetime (X,Y) and
operates under policy $M \in \mathcal{M}_{\mathcal{X}}$. We seek an objective function with a similar
interpretation, but now “time in use” is more problematic. Let (U,V) denote the
replacement time under policy M. We consider two cases. First, suppose $(X,Y) \in M$.
This means that the device failed before crossing the boundary of M, so clearly
(U,V) = (X,Y). Thus, its “time in use” is U = X and V = Y, and its two-dimensional
replacement time is simply (X,Y). Second, suppose $(X,Y) \notin M$. We know the device
begins its life at (0,0). As it ages, it traces out a usage path terminating at (X,Y), which,
by assumption, lies outside of M. At some point, its usage curve crossed the boundary of
M. Had policy M been in place, its “time in use” in both scales would be the point at
which the usage path crossed the boundary of M. But by assumption we only know (X,Y)
and M, not its usage path. Since usage paths are often approximated by a straight line, we
adopt the following convention: let (U,V) be the point of intersection of the boundary of
M and the chord connecting (0,0) to (X,Y). We describe (U,V) in either case as follows:

$$U = \sup\{x \le X : (x, (Y/X)x) \in M\}, \quad \text{and} \quad V = (Y/X)U. \tag{6.2}$$
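The convention in (6.2) can be sketched for any policy described by a membership test. This is an illustration under stated assumptions: `inside` is a hypothetical predicate for the policy M, M is assumed to be a lower set (so points on the chord from the origin are in M up to the crossing point and outside afterwards), and the crossing point is located by bisection to a tolerance rather than in closed form.

```python
def replacement_point(x, y, inside, tol=1e-9):
    """Approximate the replacement point (U, V) of (6.2) for a failure at (x, y).

    inside(a, b) is a hypothetical membership test for the policy M.
    M is assumed to be a lower set, so membership along the chord from
    (0, 0) to (x, y) switches from True to False exactly once.
    """
    if inside(x, y):
        return x, y                    # failure occurs before the boundary
    slope = y / x
    lo, hi = 0.0, x                    # chord is in M near lo, outside at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if inside(mid, slope * mid):
            lo = mid
        else:
            hi = mid
    return lo, slope * lo


# Example with a rectangular policy (0, 4) x (0, 3); the limits are made up.
in_rectangle = lambda a, b: a < 4 and b < 3
print(replacement_point(2, 1, in_rectangle))   # failure inside M: (2, 1)
```

For the rectangle above, a failure at (10, 2) is replaced when the chord reaches the chronological limit, near (4, 0.8), while a failure at (2, 6) is replaced when the chord reaches the usage limit, near (1, 3).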

We now construct the two-dimensional renewal reward process for a device
operating under policy $M \in \mathcal{M}_{\mathcal{X}}$. We are given two-dimensional failure times $(X_1,Y_1)$,
$(X_2,Y_2)$, ... iid from some bivariate lifetime distribution F; thus $\{(U_i, V_i)\}$ are iid. Let
R(u,v) denote [0,u] x [0,v]. Let N(u,v) represent the total number of replacements made
in R(u,v). Since the $\{(U_i, V_i)\}$ are iid, {N(u,v); u > 0, v > 0} is a two-dimensional renewal
process. As in the one-dimensional case, let the “reward” (i.e., cost for replacement) R be
K if replaced due to age and (K + C) if replaced due to failure. Let Z(u,v) represent the
total cost incurred in R(u,v). Then {Z(u,v); u > 0, v > 0} is a two-dimensional renewal
reward process, with inter-renewal times $\{(U_i, V_i)\}$, rewards $R_i = K + C\,I[(X_i, Y_i) \in M]$,
and $Z(u,v) = \sum_{i=1}^{N(u,v)} R_i$. Recall the cost of policy $(0,\tau)$ in the one-dimensional case is
$C(\tau) = \lim_{t \to \infty} Z(t)/t = E[R_1]/E[U_1]$, as discussed in Appendix A. To obtain a similar
limiting result for the situation we have just described, we apply Corollary 6.1. Thus, let
a = 1 and b = E[Y]/E[X]; let $\mu_1(M) = E[U]$ and $\mu_2(M) = E[V]$. From Corollary 6.1,

$$C(M) = \lim_{x \to \infty} Z(x,bx)/x = E[R_1]/\max\{\mu_1(M),\ \mu_2(M)/b\}, \tag{6.3}$$

with dimension cost per unit of chronological time. The coefficient b in (6.3) is
motivated from the “conversion factor” used by Kordonsky and Gertsbakh (1994), and
can be interpreted as follows. From a reliability standpoint, one unit of usage is worth
E[X]/E[Y] units of chronological time, on average.

To “solve” the multiple-scale age replacement problem in this setting, we must
find the M* in $\mathcal{M}_{\mathcal{X}}$ which minimizes this expression. We now demonstrate how to solve
the appropriate optimization problem for a specific subset of $\mathcal{M}_{\mathcal{X}}$.

C. FINDING THE BEST RECTANGULAR POLICY 

The aim of this section is to search over the set $\mathcal{M}_R = \{R(s,t) : s > 0,\ t > 0\}$, the set
of all “lower rectangular” policies (0,s) x (0,t). Observe $\mathcal{M}_R \subset \mathcal{M}_{\mathcal{X}}$. The set of lower
rectangles is attractive since rectangular policies are easily implemented: a device is
replaced upon failure or when its elapsed chronological time or cumulative usage reaches
some “limit.” Hence, rectangular policies are closely akin to automobile warranties. In
this section, we derive the form of the cost function for a given rectangle and describe the
minimizer of the cost function formed when F is estimated by the empirical distribution
on the bivariate data. For the same reasons as in the univariate problem, it is convenient
to define F(x,y) = P(X < x, Y < y) for (x,y) in $\mathcal{X}$. We now calculate the numerator and
denominator in (6.3) for C(s,t), the cost when M = R(s,t).

We find the numerator of C(s,t) in a manner similar to the single-scale case.
Define reward R by

$$R = \begin{cases} K + C, & \text{if } (X,Y) \in (0,s) \times (0,t) \\ K, & \text{if } (X,Y) \notin (0,s) \times (0,t). \end{cases} \tag{6.4}$$

Thus, the numerator is E[R] = (K + C) F(s,t) + K (1 - F(s,t)) = K + C F(s,t).

To compute the denominator, let $\mu_1(s,t) = E[U]$ and $\mu_2(s,t) = E[V]$, where U and V
are defined as in (6.2). For a fixed (s,t) in $\mathcal{X}$, let $A_1(s,t) = (0,s) \times (0,t)$,
$A_2(s,t) = \{(x,y) \in \mathcal{X} : y \ge t \text{ and } y > (t/s)x\}$, and
$A_3(s,t) = \{(x,y) \in \mathcal{X} : x \ge s \text{ and } y \le (t/s)x\}$.
In what follows the parameters (s,t) are omitted from these sets to simplify notation.
Observe that these regions form a partition of $\mathcal{X}$. From (6.2), we find

$$U = \begin{cases} X, & \text{if } (X,Y) \in A_1 \\ tX/Y, & \text{if } (X,Y) \in A_2 \\ s, & \text{if } (X,Y) \in A_3. \end{cases} \tag{6.5}$$

Thus,

$$\mu_1(s,t) = \iint_{A_1} x\,dF(x,y) + \iint_{A_2} (tx/y)\,dF(x,y) + \iint_{A_3} s\,dF(x,y). \tag{6.6}$$

Similarly,

$$V = \begin{cases} Y, & \text{if } (X,Y) \in A_1 \\ t, & \text{if } (X,Y) \in A_2 \\ sY/X, & \text{if } (X,Y) \in A_3, \end{cases} \tag{6.7}$$

and it follows that

$$\mu_2(s,t) = \iint_{A_1} y\,dF(x,y) + \iint_{A_2} t\,dF(x,y) + \iint_{A_3} (sy/x)\,dF(x,y). \tag{6.8}$$

Assembling the parts, we find that the cost when M = R(s,t) is

$$C(s,t) = \frac{K + C\,F(s,t)}{\max\{\mu_1(s,t),\ \mu_2(s,t)/b\}}. \tag{6.9}$$


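When F is estimated by a discrete bivariate distribution with mass $p_i$ on $\{(x_i, y_i),$
$i = 1, \ldots, n\}$, such as the empirical distribution, C(s,t) is estimated as follows. Let $I_j(i)$
denote the indicator function on set $A_j$ for i = 1, ..., n and j in 1,2,3. That is,

$$I_j(i) = \begin{cases} 1, & \text{if } (x_i, y_i) \in A_j \\ 0, & \text{otherwise.} \end{cases} \tag{6.10}$$

Then, it can be shown that the quantities E[R], $\mu_j(s,t)$ and b are estimated by

$$\hat{E}[R] = K + C \sum_{i=1}^{n} I_1(i)\,p_i, \tag{6.11}$$

$$\hat{\mu}_1(s,t) = \sum_{i=1}^{n} \left[x_i I_1(i) + (t x_i/y_i) I_2(i) + s I_3(i)\right] p_i, \tag{6.12}$$

$$\hat{\mu}_2(s,t) = \sum_{i=1}^{n} \left[y_i I_1(i) + t I_2(i) + (s y_i/x_i) I_3(i)\right] p_i, \text{ and} \tag{6.13}$$

$$\hat{b} = \sum_{i=1}^{n} y_i p_i \Big/ \sum_{i=1}^{n} x_i p_i. \tag{6.14}$$

We substitute these into (6.9), obtaining

$$\hat{C}(s,t) = \frac{\hat{E}[R]}{\max\{\hat{\mu}_1(s,t),\ \hat{\mu}_2(s,t)/\hat{b}\}}. \tag{6.15}$$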


Let us now explain how to find the minimizing value of (6.15). Recall that to
solve the one-dimensional problem it suffices to evaluate $\hat{C}(\tau)$ in (1.3) at each of the
observations. We apply a similar strategy to find the minimizer of $\hat{C}(s,t)$. Because (1.3)
and (6.15) are developed in a similar manner, it is tempting to think that to find the
minimizer of $\hat{C}(s,t)$, it suffices simply to evaluate (6.15) at $(x_i, y_i)$, i = 1, ..., n, and select
the two-dimensional failure time with the smallest cost. Upon closer examination, we
find that it is necessary to evaluate (6.15) at other points in addition to the two-
dimensional failure times. Let $\hat{z}$ be such a minimizer, i.e., $\hat{C}(\hat{z}) \le \hat{C}(s,t)$ for all (s,t) in
$\mathcal{X}$. We now describe how to find $\hat{z}$.

For convenience, suppose that no chronological failure times share the same
value, so that the chronological failure times can be strictly ordered $x_{(1)} < x_{(2)} < \cdots < x_{(n)}$,
and similarly suppose the usage failure times can be ordered $y_{(1)} < y_{(2)} < \cdots < y_{(n)}$. Let
$x_{(0)} = 0 = y_{(0)}$ and $x_{(n+1)} = \infty = y_{(n+1)}$. Form a grid

$$\Gamma = \{x_{(1)}, x_{(2)}, \ldots, x_{(n)}\} \times \{y_{(1)}, y_{(2)}, \ldots, y_{(n)}\}. \tag{6.16}$$

Note $\Gamma$ defines a partition of $\mathcal{X} = (0,\infty)^2$ into rectangles of the form
$(x_{(i)}, x_{(i+1)}] \times (y_{(j)}, y_{(j+1)}]$, $i, j \in \{0, 1, \ldots, n\}$. Let $n(s,t) = \hat{E}[R]$ and
$d(s,t) = \max\{\hat{\mu}_1(s,t),\ \hat{\mu}_2(s,t)/\hat{b}\}$ from (6.11),
(6.12), (6.13) and (6.14). Consider the numerator. Note that n(s,t) is constant on every
$(x_{(i)}, x_{(i+1)}] \times (y_{(j)}, y_{(j+1)}]$, continuous from the left in s for all t, continuous from the left in t
for all s, and non-decreasing in both s and t with jumps that can only occur on the north
and east boundaries of the $(x_{(i)}, x_{(i+1)}] \times (y_{(j)}, y_{(j+1)}]$. Consider the denominator. We have
$\hat{\mu}_1(s,t) = \sum_{i=1}^{n} q_i(s,t)\,p_i$, where $q_i(s,t) = x_i I_1(i) + (t x_i/y_i) I_2(i) + s I_3(i)$. It can be shown
that $q_i(s,t)$, and hence $\hat{\mu}_1(s,t)$, is continuous and non-decreasing in both s and t.
Likewise, $\hat{\mu}_2(s,t) = \sum_{i=1}^{n} r_i(s,t)\,p_i$, where $r_i(s,t) = y_i I_1(i) + t I_2(i) + (s y_i/x_i) I_3(i)$. It can
also be shown that $r_i(s,t)$, and hence $\hat{\mu}_2(s,t)/\hat{b}$, is continuous and non-decreasing in both
s and t. Thus d(s,t) is continuous and non-decreasing in s and t.

On each $(x_{(i)}, x_{(i+1)}] \times (y_{(j)}, y_{(j+1)}]$, the ratio n(s,t)/d(s,t) is thus continuous and non-
increasing in s and t and therefore has minimum value at $(x_{(i+1)}, y_{(j+1)})$. By a careful
examination of the cost function it can be shown that $\hat{C}(x_i, y_{(n)}) \le \hat{C}(x_i, y_{(n)} + y)$,
i = 1, ..., n for y > 0 and $\hat{C}(x_{(n)}, y_j) \le \hat{C}(x_{(n)} + x, y_j)$, j = 1, ..., n for x > 0. As such, it is not
necessary to search beyond the outermost point of the grid, namely $(x_{(n)}, y_{(n)})$. These
points are gathered into the following result.

Theorem 6.2: Consider the probability distribution which places mass $p_i$ on
$(x_i, y_i)$, i = 1, ..., n. Let $\hat{C}(s,t)$ be defined as in (6.15) and $\Gamma$ as in (6.16), and
$\hat{z} = \arg\min \hat{C}(s,t)$. Then, $\hat{z} \in \Gamma$.
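The search implied by Theorem 6.2 can be sketched as an exhaustive evaluation of the estimated cost over the grid of observed values. This is a minimal illustration under assumptions introduced here: equal mass 1/n on each observation (the empirical distribution), hypothetical function names, and a brute-force loop over all n x n grid points rather than any optimized search.

```python
from itertools import product


def estimated_cost(s, t, data, K, C):
    """Estimated cost of the rectangle (0, s) x (0, t), following the form of
    (6.15) with equal mass 1/n on each observed failure (x_i, y_i)."""
    n = len(data)
    failures = mu1 = mu2 = 0.0
    for x, y in data:
        if x < s and y < t:                  # A1: failure inside the rectangle
            failures += 1
            mu1 += x
            mu2 += y
        elif y >= t and y > (t / s) * x:     # A2: chord exits through the usage limit
            mu1 += t * x / y
            mu2 += t
        else:                                # A3: chord exits through the time limit
            mu1 += s
            mu2 += s * y / x
    b = sum(y for _, y in data) / sum(x for x, _ in data)   # conversion factor
    return (K + C * failures / n) / max(mu1 / n, mu2 / (n * b))


def best_rectangle(data, K, C):
    """Evaluate the estimated cost at every grid point and return a minimizer."""
    xs = sorted({x for x, _ in data})
    ys = sorted({y for _, y in data})
    return min(product(xs, ys),
               key=lambda z: estimated_cost(z[0], z[1], data, K, C))
```

Note that the grid contains points such as $(x_i, y_j)$ with $i \ne j$, so the returned policy need not coincide with any single observed failure.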


D. EXAMPLE 

Returning to the jet engine and automobile data sets, Table 6.1 contains $\hat{z}$ for the
cost ratios r = 1.0, 0.5, and 0.1 when F is estimated by the empirical distribution $\hat{F}$.
Beneath $\hat{z}$ in each cell is $\hat{F}(\hat{z})$.
|            | r = 1        | r = 0.5      | r = 0.1      |
|------------|--------------|--------------|--------------|
| Jet Engine | (4932, 2426) | (3227, 1550) | (3227, 1550) |
|            | 0.238        | 0.000        | 0.000        |
| Automobile | (330, 10300) | (368, 8000)  | (68, 8400)   |
|            | 0.578        | 0.421        | 0.053        |

Table 6.1: Rectangular Policies for Various Cost Ratios.
Parenthetical entries in the cells represent the optimal policy
corresponding to a particular value of the cost ratio r. Beneath each such
entry is the value of the empirical distribution at this point.


We make the following observations from Table 6.1. First, as indicated by the values
$\hat{F}(\hat{z})$, more conservative policies are selected as r decreases (under more conservative
policies, devices have a smaller chance of failure before replacement). However, the
policies are not always nested; in particular, for the automobile data, the policy for r = 0.1
is not contained in the policy for r = 0.5. Also, none of the $\hat{z}$ correspond with
observations, thus amplifying the need to evaluate the estimated cost function at all points
in the grid $\Gamma$. Figure 6.1 depicts the policies for the jet engine data. Note from Table 6.1
that the policy for r = 0.5 is identical to the policy for r = 0.1, and that this policy is
nested within the policy for r = 1.
Figure 6.1: Rectangular Policies for Jet Engine Data.

The dashed lines represent the boundaries of the policies for r = 0.5 and 1. 
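The nesting observations can be checked directly from the entries of Table 6.1: one lower rectangle is contained in another exactly when both of its limits are no larger. The helper name `is_nested` is an assumption made for this sketch; the policy values are those reported in the table.

```python
def is_nested(inner, outer):
    """True if the rectangle (0, inner[0]) x (0, inner[1]) lies within
    (0, outer[0]) x (0, outer[1])."""
    return inner[0] <= outer[0] and inner[1] <= outer[1]


# Policies (s, t) from Table 6.1, keyed by cost ratio r.
jet = {1.0: (4932, 2426), 0.5: (3227, 1550), 0.1: (3227, 1550)}
auto = {1.0: (330, 10300), 0.5: (368, 8000), 0.1: (68, 8400)}

print(is_nested(jet[0.1], jet[0.5]), is_nested(jet[0.5], jet[1.0]))  # True True
print(is_nested(auto[0.1], auto[0.5]))                              # False
```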


E. DISCUSSION AND SUMMARY 

In this chapter we developed the two-dimensional renewal reward process, and it
served as the foundation on which to build the cost function for policies in $\mathcal{M}_{\mathcal{X}}$ under a
joint model for (X,Y). The cost function arises from the analog of the univariate Renewal
Reward Theorem, and has dimension cost per unit of chronological time in use, much
like (1.1). In the latter half of this chapter, we derived the form of the cost function for
rectangular policies and showed how to find the rectangular policy with lowest cost given
a set of bivariate failure data. The notions developed in this chapter are easily extended
to policies based on more than two scales.
We do not claim the policy $\hat{z}$ produced by this procedure is an estimate of a true
optimal z* for the underlying F. Unlike the case of several linear paths, we have yet to
find examples of non-trivial bivariate distributions for which an optimal z* or an
equivalence class of such policies exists. The closest work in the literature is that of
Murthy et al. (1995) in which the parameters of the optimal rectangular warranty policy
are found for certain named bivariate distributions, but the cost functions used to define
“optimal” are very different in nature from ours. Perhaps certain bivariate notions of
aging (e.g., bivariate IFR, etc.) can be used to identify distributions for which a z* exists.
Also, under additional conditions, it may be possible to show that $\hat{z}$ converges to z*. If
such distributions can be identified, simulation studies can be conducted to verify the
small-sample properties of $\hat{z}$.
VII. CONCLUSIONS


In this dissertation, we generalize the classical age replacement policy to the case 
in which the age of a device is recorded in more than one time scale. We use several case 
studies to motivate the form of a general replacement policy in multiple scales. The case 
studies demonstrate the need for careful consideration in developing such policies. In the 
first two, we notice that in some situations, simply ignoring the usage scale may not be 
problematic, but in others, failure times in one single scale (e.g., chronological time) may 
not capture the entire damage accumulation process. The third case study reveals that a 
naive (though seemingly sensible) approach for data lying along linear paths can result in 
a policy that, although “optimal” from the standpoint of (estimated) costs, is not sensible 
from the standpoint of implementation. Based on these observations, we describe a class 
of policies that are sensible from the standpoint of implementation. This class 
generalizes multiple-scale policies found in the literature. Furthermore, we find it is 
desirable for multiple-scale policies to be nested when considering (in sensitivity 
analyses, for example) a decreasing sequence of cost ratios; otherwise, the replacement 
times prescribed by the policies can be inconsistent with the interpretation of the cost 
ratios. 

When failure times are recorded in multiple scales, it becomes readily apparent 
that identical devices do not operate under identical field conditions. Researchers are 
grappling with ways to use such lifetime data to produce comprehensive models, and 
some are seeking to use these models in the arena of optimal preventive maintenance. 
Methods for developing preventive maintenance policies for such devices fall on a
continuum ranging between two extremes. One extreme, as noted by Kordonsky and 
Gertsbakh (1997) is to provide an individualized policy for every single device in the 
population. They note such an approach is totally impractical and, as a result, 
unacceptable. The other extreme is the “one-size-fits-all” approach, in which the 
“optimal” policy is based on fitting a single distribution to observations which, in reality, 
may come from a mixture; this policy is then applied to the entire population. Basing a 
policy on a combined scale falls in between these extremes in that data in two scales are 
modeled by a univariate distribution in some “optimal” scale. As expressed by 
Kordonsky and Gertsbakh (1997), the goal of such approaches is to find a scale in which 
maintenance actions can be described “in a unified way which would fit all exemplars 
and would cover all operational conditions.” We carefully examine policies based on 
combined scales arising from three approaches in the literature in light of “desirable” 
properties. We find that each of the three approaches lacks features important when 
developing multiple-scale policies. In one approach, the observations are translated into 
many different scales and the scale corresponding to the minimum value of a “converted” 
cost function is defined to be “best.” This approach, although motivated from the 
standpoint of minimizing costs, does not guarantee nested policies in the original scales. 
In the second approach, a combined scale is found in a manner unrelated to maintenance 
costs. Policies based on this scale have the same “shape” and are nested. The third 
approach also restricts the form of the policy in a manner unrelated to costs. This
approach, although appropriate in some preventive maintenance contexts, does not seem
best suited for age replacement. 

We consider multiple-scale age replacement in two settings. In the first, since it is 
common in the literature to approximate unknown usage paths with straight lines, we 
develop a procedure based on the assumption that devices age along linear paths. Like 
the scale-combining approaches, our approach lies between the extremes in that it can 
result in different policies for devices on different usage paths. However, our procedure 
does not rely on finding an “optimal” scale. Instead, it considers the lifetime distributions 
corresponding to devices on different paths in a manner resulting in an estimate of the 
optimal policy among a class of “sensible” policies. We show that under mild conditions, 
the estimated optimal replacement times are strongly consistent estimators of the true 
optimal replacement times, and then show by simulation that these estimates are well- 
behaved in small-sample situations. It is also shown that our procedure tends to produce 
policies having lower true cost than those based on the min CV method. 

In the second setting, device usage paths are unknown. We define the two- 
dimensional renewal reward process, and prove a two-dimensional version of the 
Renewal Reward Theorem. Using this result, we develop the cost function by which we 
can evaluate various policies under the assumption of a joint model for the bivariate 
failure times. We also derive the form of the cost function for a smaller class of 
alternatives and present numerical results obtained from solving the corresponding 
optimization problem for various two-dimensional failure data sets. 
We note that our contributions may seem to fall in the area known as
“multivariate age replacement.” The literature in this realm, however, differs
significantly from ours. In this literature, “multivariate age” refers to the ages of several
components, where age is measured in a single scale. For example, Ebrahimi (1997)
defines MAR($T_1, \ldots, T_k$), the policy for multivariate age replacement for a system of k
components which replaces component i, i = 1,2,..., k, either at age $T_i$ or upon its
failure. For the case k = 2, Ebrahimi explains how to find the optimal MAR($T_1, T_2$) for both
series and parallel systems. Heinrich and Jensen (1996) also discuss optimal replacement
in a two-component parallel system, as does Scheaffer (1975).

Numerous extensions to the dissertation research present themselves. Throughout 
this dissertation our main focus has been on data consisting of ordered pairs representing 
the chronological age at failure and the cumulative usage at failure. In some cases (e.g., 
the aircraft wing joint we mention in the Introduction) more than one measure of usage 
may be available; in other cases, values of other external covariates thought to impact the 
failure process may be available. The concept of a lower set generalizes to higher 
dimensions, and the problem of incorporating additional external covariates into policy 
estimation is worthy of consideration. In fact, as noted in the Introduction, the definition 
of time scale is general enough to include such cases. In the single-scale realm, Love and 
Guo (1991) and Kumar and Westberg (1997) present methods for obtaining age 
replacement policies for a pressure gauge given covariate information (the data set can be 
found in Appendix B). Both of these use a parametric model to incorporate the effect of 
the covariate on gauge lifetime. The work of Makis and Jardine (1992, 1999) in the 
single-scale realm is more comprehensive. They recommend a combination of age
replacement and “condition-based” replacement in hopes of obtaining replacement 
decisions that are more accurate than by employing one approach or the other. The 
foundation of their work is the Cox proportional hazard model (PHM) with time- 
dependent covariates. Given a data set of the form considered in this dissertation, we can 
obtain (in concept) a multiple-scale replacement policy by treating the measurements 
from the second time scale as the time-dependent covariate. Duchesne (1999), however, 
remarks that “because models with covariates treat the time variable and the covariates 
quite asymmetrically, it is not recommended to choose an arbitrary scale as the main 
scale and the other scale as covariates.” Farewell and Cox (1979) issue a similar 
warning. Of course, one can conceive of a situation where a wealth of information is 
available at device failure, including time in various scales and numerous condition 
measurements (some of which may be internal covariates such as measures of wear). In
such cases, we echo Duchesne’s (1999) call for methods for the systematic identification 
of information categories for inclusion in models for device failure. 

The procedure developed in Chapter IV relies on the assumption that for a given
r > 0, the collection of conditional distributions $\{F_i\}$ has unique and finite
$(\tau_1^*, \tau_2^*, \ldots, \tau_m^*) \in A$. Further investigation is needed to characterize families with this
property. This would provide a means for checking model assumptions before applying
the procedure. We note that stochastic ordering (or even the stronger failure rate
ordering) of the conditional lifetimes is not sufficient to guarantee $(\tau_1^*, \tau_2^*, \ldots, \tau_m^*) \in A$.
In addition, numerous extensions were made to the basic problem with cost 
function (1.1) in the years following its initial development, as noted in the surveys by 
McCall (1965), Pierskalla and Voelker (1976), and Valdez-Flores and Feldman (1989). 
Such extensions as cost discounting, imperfect repair, and others are also viable research 
topics for the multiple-scale problem. 

The cost function (4.2) by which we define the “optimal” composite policy is of 
the “average of cost functions” form considered by Gertsbakh and Kordonsky (1997). 
Letting R denote the “reward” (cost) of a replacement and U the replacement time, the 
estimation of the optimal policy based on a “true” reward functional of the form 
E[R]/E[U] for the linear path case would also be a worthwhile pursuit. Here E[R] and 
E[U] could be found by a conditioning approach (e.g., Ross, 1997). This function has a 
slightly different interpretation than the one in (4.2), and is closely related to (6.3). 

Finally, we note much can be built on the foundation created in Chapter VI, where 
we focus on non-parametric policy estimation for the case in which observations do not 
fall on linear paths. For example, we concentrate specifically on rectangles. While such 
policies are easily implemented, it is conceivable that other members of M x may result in 
lower cost than the “best” rectangular policy (if it exists) for a given F. For instance, for 
some F, the class of policies bounded by the quantile curves of F may be worthy of 
consideration. Under such a policy (much like the policies based on an ideal time scale) 
the probability of failure before replacement would be identical for devices on any usage 
path. However, implementation may be difficult due to the shape of such a policy. It 
may also be fruitful to consider clustering methods for the case of unknown usage paths. 
In such an approach, observations (X,Y) could be clustered by their (estimated) usage rate 
Y/X and then projected onto the line with slope corresponding to their respective cluster 
center. With the data in this form, the techniques of Chapter IV could then be applied to 
the “projected” data. A similar approach was suggested by Duchesne (1999) for non- 
parametric estimation of the ITS. 
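
The clustering idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the clustering rule (a one-dimensional k-means on Y/X), the number of clusters k, the orthogonal projection, the function names `cluster_rates`, `project_onto_line`, and `project_by_cluster`, and the example points are all hypothetical choices made here, not something prescribed in the text or by Duchesne (1999).

```python
def cluster_rates(rates, k, iters=100):
    """One-dimensional k-means on usage rates; returns (centers, labels)."""
    ordered = sorted(rates)
    # Assumed initialisation: evenly spaced order statistics of the rates.
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(rates)
    for _ in range(iters):
        # Assign each rate to its nearest center.
        labels = [min(range(k), key=lambda j: abs(r - centers[j])) for r in rates]
        new_centers = []
        for j in range(k):
            members = [r for r, lab in zip(rates, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def project_onto_line(x, y, slope):
    """Orthogonal projection of (x, y) onto the line y = slope * x."""
    t = (x + slope * y) / (1.0 + slope * slope)
    return t, slope * t


def project_by_cluster(points, k):
    """Cluster points by usage rate y/x, then project each onto its cluster line."""
    rates = [y / x for x, y in points]
    centers, labels = cluster_rates(rates, k)
    projected = [project_onto_line(x, y, centers[lab])
                 for (x, y), lab in zip(points, labels)]
    return centers, labels, projected


# Hypothetical observations (X, Y), for illustration only.
example_points = [(100, 1000), (200, 2100), (300, 2900),
                  (100, 3000), (200, 6100), (300, 8900)]
example_centers, example_labels, example_projected = project_by_cluster(example_points, 2)
```

Because X and Y are measured in different units, an orthogonal projection depends on how the axes are scaled; other projections (for example, holding X fixed and moving Y onto the line) may be equally defensible.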

APPENDIX A: RENEWAL THEORETIC DEFINITIONS AND DERIVATION OF COST FUNCTION 

A. DEFINITIONS 

The following renewal theoretic definitions are from Ross (1997). A stochastic 
process $\{N(t);\ t \ge 0\}$ is a counting process if $N(t)$ represents the total number of events 
that have occurred up to time $t$. Let $\{N(t);\ t \ge 0\}$ be a counting process and let $X_n$ denote 
the time between the $(n-1)$st and $n$th event of this process, $n \ge 1$ (henceforth these times 
will be called “inter-renewal times”). If the inter-renewal times $\{X_n\}$ are independent and 
identically distributed (iid), the counting process $\{N(t);\ t \ge 0\}$ is a renewal process; a 
“renewal” has taken place when an event has occurred. Given a renewal process 
$\{N(t);\ t \ge 0\}$ with inter-renewal times $\{X_n\}$, let $R_n$ denote the reward earned at the time of 
the $n$th renewal. Assume the $R_n$, $n \ge 1$, are iid; $R_n$ may depend on $X_n$. Let 

$$Z(t) = \sum_{n=1}^{N(t)} R_n$$

represent the total reward earned up to time $t$; $\{Z(t);\ t \ge 0\}$ is a renewal reward process. 
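
To make the definitions concrete, the following Python sketch computes N(t) and Z(t) from a finite list of inter-renewal times and rewards. The function name `renewal_reward` and the example values are hypothetical, and the sketch assumes nonnegative inter-renewal times so that the renewal epochs are nondecreasing.

```python
from itertools import accumulate


def renewal_reward(inter_renewal_times, rewards, t):
    """Return (N(t), Z(t)): renewals by time t and total reward earned by time t."""
    renewal_epochs = accumulate(inter_renewal_times)  # X_1 + ... + X_n
    n_t = 0
    z_t = 0.0
    for epoch, reward in zip(renewal_epochs, rewards):
        if epoch > t:  # the n-th renewal has not yet occurred at time t
            break
        n_t += 1
        z_t += reward
    return n_t, z_t


# Hypothetical inter-renewal times X_n and rewards R_n, for illustration only.
example_times = [2.0, 1.0, 3.0, 2.0]
example_rewards = [5.0, 1.0, 4.0, 2.0]
example_n, example_z = renewal_reward(example_times, example_rewards, 6.5)
```

With these values the renewal epochs are 2, 3, 6, and 8, so three renewals have occurred by time 6.5 and Z(6.5) is the sum of the first three rewards.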

B. DERIVATION OF SINGLE-SCALE COST FUNCTION 

Consider a device which is maintained under an age replacement policy; that is, 
the device is replaced upon failure or when it reaches age $\tau$, whichever comes first 
(assume the replacement time is negligible). For example, consider a large supply of 
identical light bulbs. Upon failure, a light bulb is replaced instantly; operating conditions 
remain identical from one light bulb to the next. Assume replacement devices are as 
good as new. Let $X_n$ denote the lifetime of the $n$th device; assume $X_1, X_2, \ldots$ are iid with 
distribution function $F$ and survivor function $S$. For simplicity, assume $F$ is absolutely 
continuous with density $f$; Nakagawa and Osaki (1977) discuss the discrete version of this 
problem. Let $U_n = \min\{X_n, \tau\}$ denote the time between the $(n-1)$st and $n$th replacement; 
assume a replacement has occurred at time 0. Let $N(t)$ denote the number of replacements 
to occur in $(0, t]$; by the assumptions made thus far $\{N(t);\ t \ge 0\}$ is a counting process 
with times between events iid and is therefore a renewal process. Suppose the cost for 
replacement is $K > 0$ if replaced due to age (i.e., preventively) and $K + C$ if replaced due 
to failure (assume $C > 0$; this indicates the costly nature of a replacement during 
operation). Let $Z(t)$ denote the total cost incurred in $(0, t]$; $\{Z(t);\ t \ge 0\}$ is a renewal 
reward process with inter-renewal times $\{U_n\}$, where $U_n = \min\{X_n, \tau\}$, 
$R_n = K + C\,I[X_n < \tau]$, and $Z(t) = \sum_{n=1}^{N(t)} R_n$. Ross (1997) proves that if $E[R_1] < \infty$ and 
$E[U_1] < \infty$, the long-run average cost per unit of time in use is 

$$\lim_{t \to \infty} \frac{Z(t)}{t} = \frac{E[R_1]}{E[U_1]}$$

with probability 1. If we say a “cycle” is completed every time a replacement occurs, this 
limit is the “expected reward per cycle” over the “expected cycle length.” We now 
compute $E[R_1]$ and $E[U_1]$. Since $R_1 = K + C\,I[X_1 < \tau]$, we find $E[R_1] = K + C F(\tau)$. 
Since $U_1 \mid X_1 = X_1 I(X_1 < \tau) + \tau I(X_1 \ge \tau)$, we find 

$$E[U_1] = \int_0^{\tau} t f(t)\,dt + \int_{\tau}^{\infty} \tau f(t)\,dt,$$

which reduces to $\int_0^{\tau} S(u)\,du$. Thus, the long-run average cost per unit of time in use as a 
function of $\tau$ is (1.1). 
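
The ratio derived above, (K + C F(τ)) divided by the integral of S from 0 to τ, is easy to evaluate numerically. The Python sketch below does so with the trapezoidal rule and then searches a grid for the minimizing age. The Weibull lifetime (shape 2, scale 1), the costs K = 1 and C = 4, the grid, and the function names are illustrative assumptions only and are not taken from the text.

```python
import math


def weibull_cdf(t, shape, scale):
    """Weibull distribution function F(t); an assumed lifetime model."""
    return 1.0 - math.exp(-((t / scale) ** shape)) if t > 0 else 0.0


def average_cost(tau, K, C, cdf, steps=1000):
    """Long-run average cost (K + C*F(tau)) / integral_0^tau S(u) du."""
    h = tau / steps
    surv = [1.0 - cdf(i * h) for i in range(steps + 1)]
    # Trapezoidal approximation of the expected cycle length E[U_1].
    expected_cycle = h * (sum(surv) - 0.5 * (surv[0] + surv[-1]))
    return (K + C * cdf(tau)) / expected_cycle


def best_age(K, C, cdf, grid):
    """Grid point with the smallest average cost (a crude minimizer)."""
    return min(grid, key=lambda tau: average_cost(tau, K, C, cdf))


# Hypothetical setting: Weibull(shape=2, scale=1) lifetimes, K = 1, C = 4.
grid = [i / 100 for i in range(1, 301)]
tau_star = best_age(1.0, 4.0, lambda t: weibull_cdf(t, 2.0, 1.0), grid)
```

A grid search is used only for transparency; any one-dimensional optimizer could replace it. For an exponential lifetime the same function reduces to K/F(τ) + C times the failure rate, which decreases in τ, consistent with the familiar result that preventive replacement does not help when the failure rate is constant.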

C. SINGLE-SCALE COST FUNCTION AND SCALE FAMILIES 

The following lemma shows that (1.1) behaves “as expected” in scale families. 
Lemma A.1 (Optimal Replacement Time Ordering in Scale Families): Let $Z$ 
and $Y$ denote lifetimes from distributions $F_Z$ and $F_Y$, respectively, where $Z \sim aY$, with 
$a > 0$. Let $K$ and $C > 0$. Let $\tau_Z^*$ and $\tau_Y^*$ minimize (1.1) when $F = F_Z$ and $F_Y$, respectively. 
Then, $\tau_Z^* = a\,\tau_Y^*$. 

Proof: Let $\tau > 0$. Then, by definition 

$$C_Z(\tau) = \frac{K + C F_Z(\tau)}{\int_0^{\tau} S_Z(u)\,du}.$$

It follows that 

$$C_Z(\tau) = \frac{K + C F_Y(\tau/a)}{a \int_0^{\tau} S_Y(u/a)(1/a)\,du} = \frac{K + C F_Y(\tau/a)}{a \int_0^{\tau/a} S_Y(u)\,du} = \frac{1}{a}\, C_Y(\tau/a).$$

But then 

$$\tau_Z^* = \arg\min_{\tau} C_Z(\tau) = \arg\min_{\tau} \frac{1}{a}\, C_Y(\tau/a) = \arg\min_{\tau} C_Y(\tau/a) = a\,\tau_Y^*,$$


where the last two equalities follow by observing that (1) minima are preserved under 
vertical shrinking, and (2) minima are scaled upon horizontal stretching. 
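
Lemma A.1 can be checked numerically. The Python sketch below takes Y to be Weibull with shape 2 and scale 1, sets Z = aY with a = 3, and compares grid minimizers of the two cost functions. The distribution, the constants, the grids, and the function names are assumptions made for illustration; a grid search only approximates the true minimizers.

```python
import math


def average_cost(tau, K, C, cdf, steps=1000):
    """(K + C*F(tau)) / integral_0^tau S(u) du, by the trapezoidal rule."""
    h = tau / steps
    surv = [1.0 - cdf(i * h) for i in range(steps + 1)]
    return (K + C * cdf(tau)) / (h * (sum(surv) - 0.5 * (surv[0] + surv[-1])))


def cdf_Y(t):
    """Assumed lifetime Y: Weibull with shape 2 and scale 1."""
    return 1.0 - math.exp(-t ** 2) if t > 0 else 0.0


a = 3.0  # assumed scale factor, Z = a * Y


def cdf_Z(t):
    """F_Z(t) = P(aY <= t) = F_Y(t / a)."""
    return cdf_Y(t / a)


K, C = 1.0, 4.0  # hypothetical costs
grid_Y = [i / 100 for i in range(1, 301)]
grid_Z = [a * t for t in grid_Y]  # the same grid, stretched horizontally by a

tau_Y = min(grid_Y, key=lambda t: average_cost(t, K, C, cdf_Y))
tau_Z = min(grid_Z, key=lambda t: average_cost(t, K, C, cdf_Z))
```

Using a stretched copy of the same grid for Z mirrors the “horizontal stretching” step of the proof, so the two grid minimizers can be compared directly.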
APPENDIX B: DATASETS 


1. Automobile data. This data set consists of 19 failure times in days since purchase and 
number of miles driven (to the nearest 100 miles) for a particular automobile component. 
The data set is taken from Wilson (1993, p. 32). The data are presented in the table 
below. 


Failure   Days          Miles
   1      146            3200
   2      251           11100
   3      251           11100
   4      470           14100
   5      26             8400
   6      330            8500
   7      [illegible]    6800
   8      210            9100
   9      368            6500
  10      68             1200
  11      340           11000
  12      384           12400
  13      286            8000
  14      306           10300
  15      105            1900
  16      24             1100
  17      95             2200
  18      101            4200
  19      187            2400


Table B.1: Auto Data 
2. Metal fatigue data. This data set was discussed in Kordonsky and Gertsbakh (1993, 
p. 240); a summary of their description of the data set follows. A sample of 30 identical 
steel specimens was divided into six groups of size five; each group was subjected to a 
cyclic two-level loading regime until failure. The loading regime for group $j$ was a 
periodic sequence of 5000 loading cycles consisting of $5000\alpha_j$ cycles of small amplitude 
(i.e., low load) followed by $5000(1-\alpha_j)$ cycles of large amplitude (i.e., high load), $j = 1, \ldots, 6$. 
The table below records the cumulative number of low cycles and high cycles at 
failure for each specimen, each divided by 10 (columns Low/10 and High/10). 


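Specimen   α_j    Low/10        High/10
    1      0.95   [illegible]   1350
    2      0.95   [illegible]   1160
    3      0.95   [illegible]   1925
    4      0.95   [illegible]   1750
    5      0.95   [illegible]   2000
    6      0.80   [illegible]   [illegible]
    7      0.80   [illegible]   [illegible]
    8      0.80   [illegible]   [illegible]
    9      0.80   15600         [illegible]
   10      0.80   [illegible]   [illegible]
   11      0.60   [illegible]   [illegible]
   12      0.60   [illegible]   [illegible]
   13      0.60   [illegible]   [illegible]
   14      0.60   5700          3730
   15      0.60   6600          4270
   16      0.40   3200          4570
   17      0.40   [illegible]   [illegible]
   18      0.40   [illegible]   [illegible]
   19      0.40   4200          6060
   20      0.40   [illegible]   8040
   21      0.20   [illegible]   [illegible]
   22      0.20   [illegible]   [illegible]
   23      0.20   [illegible]   [illegible]
   24      0.20   1900          7260
   25      0.20   1100          4200
   26      0.05   300           5390
   27      0.05   375           6855
   28      0.05   425           7795
   29      0.05   332           5795
   30      0.05   275           5125
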


Table B.2: Metal Data 
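
As a quick consistency check on the loading regime described above, the fraction of low-amplitude cycles at failure, Low/(Low + High), should be close to α_j. The Python sketch below computes this fraction for seven rows of Table B.2 whose α_j, Low/10, and High/10 entries are all legible in this copy. The function name `low_fraction` and the choice of rows are ours, made for illustration.

```python
# (specimen, alpha_j, Low/10, High/10) for rows of Table B.2 with all entries legible.
rows = [
    (14, 0.60, 5700, 3730),
    (15, 0.60, 6600, 4270),
    (16, 0.40, 3200, 4570),
    (24, 0.20, 1900, 7260),
    (25, 0.20, 1100, 4200),
    (26, 0.05, 300, 5390),
    (30, 0.05, 275, 5125),
]


def low_fraction(low, high):
    """Fraction of all loading cycles at failure that were low-amplitude cycles."""
    return low / (low + high)


# Deviation of the observed low-cycle fraction from the nominal alpha_j.
deviations = {spec: low_fraction(low, high) - alpha
              for spec, alpha, low, high in rows}
```

For these rows the observed fraction lies within 0.02 of the nominal α_j, as expected when failure occurs after many complete loading periods.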
3. Traction motor data. This data set comes from the railroad industry, and is found in 
Wilson (1993, p. 31). Table B.3 contains the time since inception of service and mileage 
at failure of forty locomotive traction motors when they were returned to the depot for 
maintenance. 


  i    miles   days        i    miles   days
  1     9766    166       21     5922    128
  2     2041     35       22     1974     31
  3    12392    249       23     2030     65
  4     9889    190       24    12532    221
  5      974     27       25    14796    316
  6     1594     41       26      979     22
  7     2128     59       27    15062    261
  8     2158     75       28     2062     32
  9    11187    223       29    16888    397
 10    47660    952       30     3099     48
 11    13827    335       31       28      1
 12     5992    164       32       95     27
 13     6925    145       33    12600    295
 14     7078    170       34     8067    140
 15     7553    140       35    41425    827
 16    25014    498       36      105      2
 17    25380    571       37    12302    209
 18    26433    499       38      447     29
 19    16494    340       39     9766    166
 20     7162    160       40    57304   1200


Table B.3: Traction Motor Data 
4. Jet engine failure data. This data set is discussed in Gertsbakh and Kordonsky (1998, 
p. 1186) and was obtained from the first author. Table B.4 contains the flight hours and 
number of landings at failure of 21 jet engines. 



Table B.4: Jet Engine Data 

5. Pressure gauge data. The table below contains the failure (or censoring, if marked by 
an asterisk) time in hours of 15 pressure gauges and the corresponding covariate value 
“pressure.” The data set is from Love and Guo (1991, p. 14). The implication is that the 
value of the covariate was fixed during each particular life cycle. Thus, for example, the 
first entry indicates that “medium” (in some sense) pressures were measured from time 0 
until failure at 70 hours. 


  i    Time (hrs)   Pressure
  1       70           4
  2       53           4
  3       77           4
  4       42           4
  5       61*          4
  6       51           5
  7       70           5
  8       32           5
  9       47           5
 10       44*          5
 11      101           3
 12       66           3
 13      198           3
 14       95           3
 15       60*          3


Table B.5: Pressure Gauge Data 